JavaCodeByJava: February 2022

Wednesday, 23 February 2022

show timestamp on elastic index

1. When creating the index, add a field of type date to the mapping, as below:

PUT /sample2

{

"settings" : {

"number_of_shards" : 3,

"number_of_replicas" : 2

}

}

PUT /sample2/_mapping

{

"properties":{

"name":{"type":"text"},

"date":{"type":"date"}

}

}

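Or insert a document directly. With the default dynamic mapping settings, Elasticsearch detects an ISO 8601 string such as the timestamp below as a date field: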

POST /sample1/_doc/1

{

"author": "Beverly Sills2",

"content": "You may be disappointed if you fail, but you are doomed if you don’t try1.",

"year": 1930,

"timestamp":"2022-02-23T16:12:11"

}
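A minimal sketch of producing that timestamp string from Java, assuming the document body is built by hand rather than with a JSON library. The class and method names (TimestampField, toIsoTimestamp, buildDocument) are made up for illustration; only java.time is used.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

class TimestampField {
    // Formats a timestamp as ISO 8601 local date-time, e.g. 2022-02-23T16:12:11.
    static String toIsoTimestamp(LocalDateTime time) {
        return time.format(DateTimeFormatter.ISO_LOCAL_DATE_TIME);
    }

    // Builds a JSON body for POST /sample1/_doc/<id>.
    // Assumption: author contains no characters that need JSON escaping.
    static String buildDocument(String author, int year, LocalDateTime time) {
        return "{\"author\":\"" + author + "\","
             + "\"year\":" + year + ","
             + "\"timestamp\":\"" + toIsoTimestamp(time) + "\"}";
    }

    public static void main(String[] args) {
        LocalDateTime time = LocalDateTime.of(2022, 2, 23, 16, 12, 11);
        System.out.println(buildDocument("Beverly Sills2", 1930, time));
    }
}
```

The resulting string can be sent as the request body with any HTTP client.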

-----------------------

2. Open Kibana at http://localhost:5601/

--> Stack Management --> Index Patterns ==> http://localhost:5601/app/management/kibana/indexPatterns

Create new index pattern --> set the name and select the timestamp field --> save

Now you can also filter on the timestamp.

Tuesday, 15 February 2022

Kafka basic

Kafka Architecture

Kafka consists of Records, Topics, Consumers, Producers, Brokers, Logs, Partitions, and Clusters. Records can have a key (optional), a value, and a timestamp. Kafka Records are immutable. A Kafka Topic is a stream of records ("/orders", "/user-signups"). You can think of a Topic as a feed name. A Topic has a Log, which is the Topic's storage on disk. A Topic Log is broken up into partitions and segments. The Kafka Producer API is used to produce streams of data records. The Kafka Consumer API is used to consume a stream of records from Kafka. A Broker is a Kafka server that runs in a Kafka Cluster. Kafka Brokers form a cluster. The Kafka Cluster consists of many Kafka Brokers on many servers. The term Broker is sometimes used more loosely for a logical system, or for Kafka as a whole.


Kafka Architecture: Topics, Producers and Consumers

Kafka Architecture - Topics, Producers and Consumers Diagram

Kafka uses ZooKeeper to manage the cluster. ZooKeeper is used to coordinate the brokers/cluster topology. ZooKeeper is a consistent file system for configuration information. ZooKeeper gets used for leadership election for Broker Topic Partition Leaders.

Kafka Architecture: Core Kafka

Kafka Architecture - Core Kafka Diagram

Kafka needs ZooKeeper

Kafka uses ZooKeeper to perform leader election for Kafka Broker and Topic Partition pairs. Kafka uses ZooKeeper to manage service discovery for the Kafka Brokers that form the cluster. ZooKeeper sends topology changes to Kafka, so each node in the cluster knows when a new Broker has joined, a Broker has died, a topic has been removed, a topic has been added, and so on. ZooKeeper provides an in-sync view of the Kafka Cluster configuration.

Kafka Producer, Consumer, Topic details

Kafka producers write to Topics. Kafka consumers read from Topics. A topic is associated with a log, which is a data structure on disk. Kafka appends records from one or more producers to the end of a topic log. A topic log consists of many partitions that are spread over multiple files, which can be spread over multiple Kafka cluster nodes. Consumers read from Kafka topics at their own cadence and can pick where they are (the offset) in the topic log. Each consumer group tracks the offset at which it left off reading. Kafka distributes topic log partitions across different nodes in a cluster for high performance with horizontal scalability. Spreading partitions aids in writing data quickly. Topic log partitions are Kafka's way of sharding reads and writes to the topic log. Partitions are also needed so that multiple consumers in a consumer group can work at the same time. Kafka replicates partitions to many nodes to provide failover.
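The append-and-offset behaviour described above can be sketched with a toy in-memory model. This is not Kafka code: PartitionLog, append and poll are hypothetical names, and a real partition lives on disk and is replicated across brokers.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of one topic partition: an append-only list plus one offset per consumer group.
class PartitionLog {
    private final List<String> records = new ArrayList<>();
    private final Map<String, Integer> groupOffsets = new HashMap<>();

    // Appends a record to the end of the log and returns its offset.
    int append(String value) {
        records.add(value);
        return records.size() - 1;
    }

    // Returns the next unread record for the group and advances that group's offset,
    // or returns null when the group has caught up with the end of the log.
    String poll(String group) {
        int offset = groupOffsets.getOrDefault(group, 0);
        if (offset >= records.size()) {
            return null;
        }
        groupOffsets.put(group, offset + 1);
        return records.get(offset);
    }

    public static void main(String[] args) {
        PartitionLog log = new PartitionLog();
        log.append("order-1");
        log.append("order-2");
        System.out.println(log.poll("billing"));   // first record for this group
        System.out.println(log.poll("billing"));   // second record for this group
        System.out.println(log.poll("shipping"));  // another group starts from offset 0
    }
}
```

Each group reads at its own pace because the log itself is never modified by a read; only the group's stored offset moves.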

Kafka Architecture: Topic Partition, Consumer group, Offset and Producers

Kafka Architecture: Topic Partition, Consumer group, Offset and Producers Diagram

Kafka Scale and Speed

How can Kafka scale if multiple producers and consumers read and write to the same Kafka topic log at the same time? First, Kafka is fast: it writes to the filesystem sequentially, which is fast. On a modern fast drive, Kafka can easily write 700 MB or more of data per second. Second, Kafka scales writes and reads by sharding topic logs into partitions. Recall that topic logs can be split into multiple partitions, which can be stored on multiple different servers, and those servers can use multiple disks. Multiple producers can write to different partitions of the same topic. Multiple consumers from multiple consumer groups can read from different partitions efficiently.

Kafka Brokers

A Kafka cluster is made up of multiple Kafka Brokers. Each Kafka Broker has a unique ID (a number). Kafka Brokers contain topic log partitions. Connecting to one broker bootstraps a client to the entire Kafka cluster. For failover, you want to start with at least three to five brokers. A Kafka cluster can have 10, 100, or 1,000 brokers if needed.

Kafka Cluster, Failover, ISRs

Kafka supports replication to support failover. Recall that Kafka uses ZooKeeper to form Kafka Brokers into a cluster, and that each node in a Kafka cluster is called a Kafka Broker. Topic partitions can be replicated across multiple nodes for failover. The topic should have a replication factor greater than 1 (2 or 3). For example, if you are running in AWS, you would want to be able to survive a single availability zone outage. If one Kafka Broker goes down, then a Kafka Broker that is an ISR (in-sync replica) can serve the data.

Kafka Failover vs. Kafka Disaster Recovery

Kafka uses replication for failover. Replication of Kafka topic log partitions allows for failure of a rack or AWS availability zone (AZ). You need a replication factor of at least 3 to survive a single AZ failure. For disaster recovery you need to use MirrorMaker, a Kafka utility that ships with Kafka core. MirrorMaker replicates a Kafka cluster to another data center or AWS region. What MirrorMaker does is called mirroring, so as not to be confused with replication.

Note that there is no hard and fast rule on how you have to set up the Kafka cluster per se. You could, for example, set up the whole cluster in a single AZ so you can use AWS enhanced networking and placement groups for higher throughput, and then use MirrorMaker to mirror the cluster to another AZ in the same region as a hot standby.

Kafka Architecture: Kafka Zookeeper Coordination

Kafka Architecture - Kafka Zookeeper Coordination Diagram


http://cloudurable.com/blog/kafka-architecture/index.html#:~:text=A%20Broker%20is%20a%20Kafka,as%20Kafka%20as%20a%20whole.

Monday, 14 February 2022

get parent branch in git

git show-branch -a \
	| grep '\*' \
	| grep -v "$(git rev-parse --abbrev-ref HEAD)" \
	| head -n1 \
	| sed 's/.*\[\(.*\)\].*/\1/' \
	| sed 's/[\^~].*//'
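What the two sed steps at the end of the pipeline do to a single line can be sketched in Java. The first step keeps only the text inside the square brackets, the second strips any ^ or ~ suffix. ShowBranchLine and branchName are hypothetical names, and the sample line is illustrative rather than captured git output.

```java
class ShowBranchLine {
    // Extracts the branch name from one line of show-branch style output.
    static String branchName(String line) {
        return line
            // Same as sed 's/.*\[\(.*\)\].*/\1/': keep what is inside [ ... ].
            .replaceFirst(".*\\[(.*)\\].*", "$1")
            // Same as sed 's/[\^~].*//': drop ancestry suffixes such as ~2 or ^.
            .replaceFirst("[\\^~].*", "");
    }

    public static void main(String[] args) {
        System.out.println(branchName("+* [develop~2] Fix login bug"));
    }
}
```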

Saturday, 12 February 2022

Kafka Partition

In previous examples, the ProducerRecord objects we created included a topic name, key, and value. Kafka messages are key-value pairs and while it is possible to create a ProducerRecord with just a topic and a value, with the key set to null by default, most applications produce records with keys. Keys serve two goals: they are additional information that gets stored with the message, and they are also used to decide which one of the topic partitions the message will be written to. All messages with the same key will go to the same partition. This means that if a process is reading only a subset of the partitions in a topic (more on that in Chapter 4), all the records for a single key will be read by the same process. To create a key-value record, you simply create a ProducerRecord as follows:

ProducerRecord<String, String> record =

new ProducerRecord<>("CustomerCountry", "Laboratory Equipment", "USA");

When creating messages with a null key, you can simply leave the key out:

ProducerRecord<String, String> record =

new ProducerRecord<>("CustomerCountry", "USA");

Here, the key will simply be set to null, which may indicate that a customer name was missing on a form.

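When the key is null and the default partitioner is used, the record will be sent to one of the available partitions of the topic. Older clients use a round-robin algorithm to balance the messages among the partitions; since Kafka 2.4 the default partitioner instead sticks to one partition until the current batch is full and then moves on to another.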

If a key exists and the default partitioner is used, Kafka will hash the key (using its own hash algorithm, so hash values will not change when Java is upgraded), and use the result to map the message to a specific partition. Since it is important that a key is always mapped to the same partition, we use all the partitions in the topic to calculate the mapping, not just the available partitions. This means that if a specific partition is unavailable when you write data to it, you might get an error. This is fairly rare, as you will see in Chapter 6 when we discuss Kafka’s replication and availability.

The mapping of keys to partitions is consistent only as long as the number of partitions in a topic does not change. So as long as the number of partitions is constant, you can be sure that, for example, records regarding user 045189 will always get written to partition 34. This allows all kinds of optimization when reading data from partitions. However, the moment you add new partitions to the topic, this is no longer guaranteed: the old records will stay in partition 34 while new records will get written to a different partition. When partitioning keys is important, the easiest solution is to create topics with sufficient partitions (Chapter 2 includes suggestions for how to determine a good number of partitions) and never add partitions.
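The effect of changing the partition count can be sketched with a simplified hash-modulo mapping. This is an illustration only: Kafka's default partitioner hashes the serialized key bytes with murmur2, while the sketch below uses String.hashCode as a stand-in, and KeyPartitioning and partitionFor are hypothetical names.

```java
class KeyPartitioning {
    // Maps a key to a partition by hashing it and taking the remainder.
    // Assumption: String.hashCode stands in for Kafka's murmur2 hash of the key bytes.
    static int partitionFor(String key, int numPartitions) {
        // Mask off the sign bit so the remainder is never negative.
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        // Same key, same partition count: always the same partition.
        System.out.println(partitionFor("USA", 4)); // prints 3
        // Same key after growing the topic to 6 partitions: a different partition.
        System.out.println(partitionFor("USA", 6)); // prints 5
    }
}
```

Because the partition count is the divisor, every key's mapping can shift as soon as that count changes, which is why the text recommends sizing topics up front.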

Kafka Producer

We start producing messages to Kafka by creating a ProducerRecord, which must include the topic we want to send the record to and a value. Optionally, we can also specify a key and/or a partition. Once we send the ProducerRecord, the first thing the producer will do is serialize the key and value objects to byte arrays so they can be sent over the network.

Next, the data is sent to a partitioner. If we specified a partition in the ProducerRecord, the partitioner doesn’t do anything and simply returns the partition we specified. If we didn’t, the partitioner will choose a partition for us, usually based on the ProducerRecord key. Once a partition is selected, the producer knows which topic and partition the record will go to. It then adds the record to a batch of records that will also be sent to the same topic and partition. A separate thread is responsible for sending those batches of records to the appropriate Kafka brokers.

When the broker receives the messages, it sends back a response. If the messages were successfully written to Kafka, it will return a RecordMetadata object with the topic, partition, and the offset of the record within the partition. If the broker failed to write the messages, it will return an error. When the producer receives an error, it may retry sending the message a few more times before giving up and returning an error.

Once we instantiate a producer, it is time to start sending messages. There are three primary methods of sending messages:

Fire-and-forget

We send a message to the server and don’t really care if it arrives successfully or not. Most of the time, it will arrive successfully, since Kafka is highly available and the producer will retry sending messages automatically. However, some messages will get lost using this method.

Synchronous send

We send a message, the send() method returns a Future object, and we use get() to wait on the future and see if the send() was successful or not.

Asynchronous send

We call the send() method with a callback function, which gets triggered when it receives a response from the Kafka broker.
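The three calling patterns can be sketched without a broker by using a stand-in send() built on the standard library. This is not the Kafka client API: SendPatterns and its send() are hypothetical, and the returned value is a plain offset rather than a RecordMetadata object. Only the shape of each call (ignore the future, block on it, or attach a callback) is meant to carry over.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicLong;

class SendPatterns {
    private static final AtomicLong NEXT_OFFSET = new AtomicLong();

    // Hypothetical stand-in for a producer's send(): completes asynchronously
    // with the offset assigned to the record.
    static CompletableFuture<Long> send(String value) {
        return CompletableFuture.supplyAsync(NEXT_OFFSET::getAndIncrement);
    }

    public static void main(String[] args) throws Exception {
        // Fire-and-forget: the returned future is ignored, so failures go unnoticed.
        send("first");

        // Synchronous send: block on get() until the result (or an exception) arrives.
        long offset = send("second").get();
        System.out.println("synchronous send acknowledged at offset " + offset);

        // Asynchronous send: register a callback that runs when the response arrives.
        CompletableFuture<Long> pending = send("third").whenComplete((ack, error) -> {
            if (error != null) {
                System.out.println("send failed: " + error);
            } else {
                System.out.println("asynchronous send acknowledged at offset " + ack);
            }
        });

        // Wait here only so the demo does not exit before the callback has run.
        pending.get();
    }
}
```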

acks

The acks parameter controls how many partition replicas must receive the record before the producer can consider the write successful. This option has a significant impact on how likely messages are to be lost. There are three allowed values for the acks parameter: 0, 1, and all.
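A minimal sketch of setting acks in a producer configuration, using only java.util.Properties. The property keys are the standard producer configuration names; the broker address localhost:9092 and the ProducerSettings class are assumptions for illustration, and passing the Properties to a producer requires the kafka-clients library, which is not shown here.

```java
import java.util.Properties;

class ProducerSettings {
    // Builds producer properties with the given acks level:
    //   "0"   - do not wait for any broker acknowledgement
    //   "1"   - wait until the partition leader has written the record
    //   "all" - wait until all in-sync replicas have the record
    static Properties producerProps(String acks) {
        Properties props = new Properties();
        // Assumed address of a local broker.
        props.setProperty("bootstrap.servers", "localhost:9092");
        props.setProperty("key.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("value.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("acks", acks);
        return props;
    }

    public static void main(String[] args) {
        Properties props = producerProps("all");
        System.out.println("acks=" + props.getProperty("acks"));
    }
}
```

Lower acks values trade durability for latency: 0 is fastest and most likely to lose messages, all is slowest and safest.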
