
Connecting Kafka to PostgreSQL

Do you want to transfer your PostgreSQL data using Kafka? Are you finding it challenging to connect Kafka to PostgreSQL? This article walks through the setup step by step.

The guide aims to make the data export process as smooth as possible without compromising efficiency.

After a complete walkthrough, you will be able to connect Kafka to PostgreSQL and transfer data to the destination of your choice for real-time analysis. It will also help you build a customized ETL (extract, transform, load) pipeline for your organization and give you a deeper understanding of the tools and techniques involved.

Table of Contents

    Prerequisites
    Introduction to Kafka
    Introduction to PostgreSQL
    Methods to Set up Kafka to PostgreSQL Integration
        Method 1: Manual process to Set up Kafka to PostgreSQL Integration
            Step 1: Installing Kafka
            Step 2: Starting the Kafka, PostgreSQL & Debezium Server
            Step 3: Creating a Database in PostgreSQL
            Step 4: Enabling the Kafka to PostgreSQL Connection
        Method 2: Using Hevo to Set up Kafka to PostgreSQL Integration
    Conclusion
    

    Prerequisites

    The steps for setting up the Kafka to PostgreSQL integration are easier to follow if you have the following:

    Working knowledge of PostgreSQL.
    Working knowledge of Kafka.
    PostgreSQL installed on the host workstation.
    Kafka installed on the host workstation.
    

    Introduction to Kafka

    Apache Kafka is an open-source message queue for publishing and subscribing to high volumes of messages in a distributed manner. It uses the leader-follower concept to replicate messages in a fault-tolerant way, and it segments and stores messages in Kafka Topics according to their subject. Kafka also lets you set up real-time streaming data pipelines and applications that transform data and stream it from source to target.

    Key Features of Kafka:

    Scalability: Kafka has exceptional scalability and can be scaled easily without downtime.
    Data Transformation: Kafka offers KStream and KSQL (in the case of Confluent Kafka) for on-the-fly data transformation.
    Fault-Tolerant: Kafka uses brokers to replicate data and persists the data to make it a fault-tolerant system.
    Security: Kafka can be combined with various security measures like Kerberos to stream data securely.
    Performance: Kafka is distributed, partitioned, and has a very high throughput for publishing and subscribing to messages.

    For further information on Kafka, you can check the official Kafka website.

      Introduction to PostgreSQL

      PostgreSQL is a powerful, enterprise-class, open-source relational database management system that uses standard SQL to query relational data and JSON to query non-relational data residing in the database. PostgreSQL runs on all major operating systems. It supports advanced data types and optimization operations found in commercial databases such as Oracle and SQL Server.

      Key features of PostgreSQL:

      It has extensive support for complex queries.
      It provides excellent support for geographic objects, so it can be used for geographic information systems and location-based services.
      It provides full support for client-server network architecture.
      Its write-ahead logging (WAL) feature makes it fault-tolerant.

      For further information on PostgreSQL, you can check the official PostgreSQL website.

      Methods to Set up Kafka to PostgreSQL Integration

      Method 1: Manual process to Set up Kafka to PostgreSQL Integration

      This method would require you to invest in the engineering team and bandwidth. This method involves the use of Debezium PostgreSQL Connector, building code to fetch data from Kafka, and loading data into PostgreSQL. Once the setup is completed, you would need to continuously monitor and maintain the infrastructure.

      Method 2: Using Hevo to Set up Kafka to PostgreSQL Integration

      Hevo Data is an automated Data Pipeline platform that can move your data from Kafka to PostgreSQL very quickly without writing a single line of code. It is simple, hassle-free, and reliable.

      Moreover, Hevo offers a fully-managed solution to set up data integration from 100+ data sources (including 30+ free data sources) and will let you directly load data to a Data Warehouse such as Snowflake, Amazon Redshift, Google BigQuery, etc. or the destination of your choice. It will automate your data flow in minutes without writing any line of code. Its Fault-Tolerant architecture makes sure that your data is secure and consistent. Hevo provides you with a truly efficient and fully automated solution to manage data in real-time and always have analysis-ready data.


      This article delves into both the manual and Hevo methods in great detail. You’ll also learn about the advantages and disadvantages of each strategy, allowing you to choose the ideal method for your needs. Below are the two methods:

      Method 1: Manual process to Set up Kafka to PostgreSQL Integration
      Method 2: Using Hevo to Set up Kafka to PostgreSQL Integration
      

      Method 1: Manual process to Set up Kafka to PostgreSQL Integration


      Kafka supports connecting to PostgreSQL and numerous other databases with the help of various in-built connectors. These connectors bring data from a source of your choice into Kafka and then stream it from Kafka Topics to the destination of your choice. Similarly, there are many connectors for PostgreSQL that help establish a connection with Kafka.

      You can set up the Kafka PostgreSQL connection with the Debezium PostgreSQL connector/image using the following steps:

      Step 1: Installing Kafka
      Step 2: Starting the Kafka, PostgreSQL & Debezium Server
      Step 3: Creating a Database in PostgreSQL
      Step 4: Enabling the Kafka to PostgreSQL Connection
      

      Step 1: Installing Kafka

      To connect Kafka to PostgreSQL, you will have to download and install Kafka in either standalone or distributed mode. The following links to the official documentation will help you get started with the installation process:

      Apache Kafka Installation Guide
      Confluent Kafka Installation Guide
      Debezium Kafka Installation Guide
      

      Step 2: Starting the Kafka, PostgreSQL & Debezium Server

      Confluent provides users with a diverse set of in-built connectors that act as the data source and sink, and help users transfer their data via Kafka. One such connector/image that lets users connect Kafka with PostgreSQL is the Debezium PostgreSQL Docker Image.

      To install the Debezium Docker image that supports connecting PostgreSQL with Kafka, go to the official GitHub project of Debezium Docker and clone it to your local system.

      Once you have cloned the project, you need to start the Zookeeper service, which stores the Kafka and Topic configuration and manages the Kafka nodes. You can do this using the following command:

      docker run -it --rm --name zookeeper -p 2181:2181 -p 2888:2888 -p 3888:3888 debezium/zookeeper:0.10

      Now with the Zookeeper up and running, you need to start the Kafka server. To do this, open a new console and execute the following command in it:

      docker run -it --rm --name kafka -p 9092:9092 --link zookeeper:zookeeper debezium/kafka:0.10

      Once Kafka and Zookeeper are running, start the PostgreSQL server that Kafka will connect to. You can do this using the following command:

      docker run --name postgres -p 5000:5432 debezium/postgres

      Now with the PostgreSQL server up and running, you need to start the Debezium instance. To do this, open a new console and execute the following command in it:

      docker run -it --name connect -p 8083:8083 -e GROUP_ID=1 -e CONFIG_STORAGE_TOPIC=my-connect-configs -e OFFSET_STORAGE_TOPIC=my-connect-offsets -e ADVERTISED_HOST_NAME=$(echo $DOCKER_HOST | cut -f3 -d'/' | cut -f1 -d':') --link zookeeper:zookeeper --link postgres:postgres --link kafka:kafka debezium/connect
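The ADVERTISED_HOST_NAME expression in the command above pulls the host part out of DOCKER_HOST. As a small sketch of what that pipeline does, the hypothetical helper docker_host_ip below (not part of Docker or Debezium) applies the same two cut calls to a value passed as an argument. The sample address in the usage note is an assumption for illustration only.

```shell
# Hypothetical helper: extract the host from a DOCKER_HOST-style URL such as
# tcp://192.168.99.100:2376, using the same cut pipeline as the docker run command.
docker_host_ip() {
  # Splitting on "/" makes field 3 "host:port"; splitting that on ":" makes field 1 the host.
  printf '%s\n' "$1" | cut -f3 -d'/' | cut -f1 -d':'
}
```

Calling docker_host_ip tcp://192.168.99.100:2376 prints 192.168.99.100. If DOCKER_HOST is unset, which is common with a local Docker daemon, the pipeline yields an empty value, so ADVERTISED_HOST_NAME is passed as an empty string.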

      Once you’ve enabled all three servers, log in to PostgreSQL with the psql command-line tool using the following command:

      psql -h localhost -p 5000 -U postgres
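If psql refuses the connection because the PostgreSQL container is still starting, a small retry loop can wait for it. The sketch below is one possible approach, not part of the original setup: retry is a hypothetical helper name, and the attempt count and delay in the usage note are arbitrary assumptions.

```shell
# Hypothetical helper: run a command up to $1 times, sleeping $2 seconds between
# attempts, and succeed as soon as the command succeeds.
retry() {
  attempts=$1
  delay=$2
  shift 2
  while [ "$attempts" -gt 0 ]; do
    if "$@"; then
      return 0
    fi
    attempts=$((attempts - 1))
    # Skip the pause after the final failed attempt.
    if [ "$attempts" -gt 0 ]; then
      sleep "$delay"
    fi
  done
  return 1
}
```

For example, retry 10 2 pg_isready -h localhost -p 5000 polls the mapped port with the pg_isready client utility until the server accepts connections, after which the psql command above should succeed.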

      This is how you can enable your Kafka, PostgreSQL, and Debezium instance servers to connect Kafka to PostgreSQL.

      Step 3: Creating a Database in PostgreSQL

      Once you’ve logged in to PostgreSQL, you now need to create a database. For example, if you want to create a database with the name “emp”, you can use the following command:

      CREATE DATABASE emp;

      With your database now ready, switch to it (in psql, \c emp) and then create a table to store the employee information using the following command:

      CREATE TABLE employee(emp_id int, emp_name VARCHAR);

      You now need to insert a few records into the table using the INSERT INTO command.
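As a minimal sketch of that INSERT INTO step, the snippet below wraps a statement for the employee table in two shell helpers. The three sample rows are hypothetical, and the helper names insert_sql and load_sample_rows are introduced here for illustration. The connection settings reuse the psql command from Step 2 and assume the table was created in the emp database.

```shell
# Print an INSERT statement for the employee table.
# The rows are hypothetical sample data, not values from the original article.
insert_sql() {
  cat <<'SQL'
INSERT INTO employee (emp_id, emp_name) VALUES
  (1, 'Alice'),
  (2, 'Bob'),
  (3, 'Carol');
SQL
}

# Send the statement to the emp database through the port mapped in Step 2.
load_sample_rows() {
  insert_sql | psql -h localhost -p 5000 -U postgres -d emp
}
```

Running load_sample_rows executes the statement. The same INSERT can also be pasted directly into an open psql session.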

      This is how you can create a PostgreSQL database and insert values in it, to set up the Kafka to PostgreSQL connection.

      Step 4: Enabling the Kafka to PostgreSQL Connection

      Once you’ve set up your PostgreSQL database, you need to enable the Kafka & PostgreSQL connection, which will pull the data from PostgreSQL and push it to the Kafka Topic. To do this, you can create the Kafka connection using the following script:

      curl -X POST -H "Accept:application/json" -H "Content-Type:application/json" localhost:8083/connectors/ -d '{ "name": "emp-connector", "config": { "connector.class": "io.debezium.connector.postgresql.PostgresConnector", "tasks.max": "1", "database.hostname": "postgres", "database.port": "5432", "database.user": "postgres", "database.password": "postgres", "database.dbname": "emp", "database.server.name": "dbserver1", "database.whitelist": "emp", "database.history.kafka.bootstrap.servers": "kafka:9092", "database.history.kafka.topic": "schema-changes.emp" } }'
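The one-line payload is hard to review, so the sketch below prints the same properties one per line and pipes them into curl. The property names and values are copied from the command above. The helper names connector_config and register_connector are hypothetical.

```shell
# Print the connector registration payload from the curl command above,
# one property per line.
connector_config() {
  cat <<'JSON'
{
  "name": "emp-connector",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "tasks.max": "1",
    "database.hostname": "postgres",
    "database.port": "5432",
    "database.user": "postgres",
    "database.password": "postgres",
    "database.dbname": "emp",
    "database.server.name": "dbserver1",
    "database.whitelist": "emp",
    "database.history.kafka.bootstrap.servers": "kafka:9092",
    "database.history.kafka.topic": "schema-changes.emp"
  }
}
JSON
}

# POST the payload to the Kafka Connect REST API.
# "-d @-" tells curl to read the request body from standard input.
register_connector() {
  connector_config | curl -X POST -H "Accept:application/json" -H "Content-Type:application/json" -d @- localhost:8083/connectors/
}
```

Calling register_connector sends the same request as the one-line command. Keeping the payload in one function makes it easier to adjust a single property later.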

      You can now check and verify the connectors using the following line of code:

      curl -X GET -H "Accept:application/json" localhost:8083/connectors/emp-connector
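The request above returns the connector's configuration. Kafka Connect also exposes a status endpoint per connector, which reports whether the connector and its tasks are running. The sketch below is one way to check it from the shell. The helper names are hypothetical, and the pattern match on the state field is a simplification that assumes the usual JSON layout of the status response.

```shell
# Base URL of the Kafka Connect REST API exposed on port 8083 in Step 2.
CONNECT_URL="http://localhost:8083"

# Build the status endpoint URL for a connector name.
status_url() {
  printf '%s/connectors/%s/status\n' "$CONNECT_URL" "$1"
}

# Read a status document on stdin; succeed only if some component reports
# RUNNING and none reports FAILED.
status_is_healthy() {
  status_doc=$(cat)
  printf '%s\n' "$status_doc" | grep -Eq '"state" *: *"RUNNING"' &&
    ! printf '%s\n' "$status_doc" | grep -Eq '"state" *: *"FAILED"'
}

# Fetch the live status of a connector and check it.
connector_is_healthy() {
  curl -s -H "Accept:application/json" "$(status_url "$1")" | status_is_healthy
}
```

For example, connector_is_healthy emp-connector && echo "connector is running" prints the message only when no component of the connector has failed.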

      To verify that Kafka is correctly pulling data from PostgreSQL, you can run the Kafka Console Consumer against the topic that the connector writes to.
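As a sketch of that verification step, the snippet below uses the watch-topic utility from the Debezium Kafka image, which the Debezium tutorial uses as a wrapper around the console consumer. The topic name is an assumption: Debezium's PostgreSQL connector names topics after the server name, schema, and table, and this example assumes the employee table sits in the default public schema. The helper names are hypothetical.

```shell
# Debezium's PostgreSQL connector names change topics <server name>.<schema>.<table>.
topic_name() {
  printf '%s.%s.%s\n' "$1" "$2" "$3"
}

# Assumes the employee table lives in the default "public" schema of the emp database.
TOPIC=$(topic_name dbserver1 public employee)

# Start a temporary container that reads the topic from the beginning (-a)
# and prints message keys (-k), as in the Debezium tutorial.
watch_employee_topic() {
  docker run -it --rm --name watcher --link zookeeper:zookeeper --link kafka:kafka debezium/kafka:0.10 watch-topic -a -k "$TOPIC"
}
```

Running watch_employee_topic should print one change event per row inserted in Step 3. If nothing appears, check the topic name against the topics the broker actually lists.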

      The consumer will then display your PostgreSQL data on the console. With Kafka now correctly pulling data from PostgreSQL, you can use KSQL/KStream or Spark Streaming to perform ETL on the data.

      This is how you can connect Kafka to PostgreSQL using the Debezium PostgreSQL connector.