Introducing persistent buffering for Amazon OpenSearch Ingestion


Amazon OpenSearch Ingestion is a fully managed, serverless pipeline that delivers real-time log, metric, and trace data to Amazon OpenSearch Service domains and OpenSearch Serverless collections.

Customers use Amazon OpenSearch Ingestion pipelines to ingest data from a variety of data sources, both pull-based and push-based. When ingesting data from pull-based sources, such as Amazon Simple Storage Service (Amazon S3) and Amazon MSK, using Amazon OpenSearch Ingestion, the source handles data durability and retention. Push-based sources, however, stream records directly to ingestion endpoints, and typically don’t have a means of persisting data once it’s generated.

To address this need for such sources, a common architectural pattern is to add a persistent standalone buffer for enhanced durability and reliability of data ingestion. A durable, persistent buffer can mitigate the impact of ingestion spikes, buffer data during downtime, and reduce the need to expand capacity using in-memory buffers, which can overflow. Customers use popular buffering technologies like Apache Kafka or RabbitMQ to add durability to the data flowing through their Amazon OpenSearch Ingestion pipelines. However, these tools add complexity to the data ingestion pipeline architecture and can be time consuming to set up, right-size, and maintain.

Solution overview

Today we’re introducing persistent buffering for Amazon OpenSearch Ingestion to enhance data durability and simplify data ingestion architectures for Amazon OpenSearch Service customers. You can use persistent buffering to ingest data for all push-based sources supported by Amazon OpenSearch Ingestion without the need to set up a standalone buffer. These include HTTP sources and OTel sources for logs, traces, and metrics. Persistent buffering in Amazon OpenSearch Ingestion is serverless and scales elastically to meet the throughput needs of even the most demanding workloads. You can now focus on your core business logic when ingesting data at scale in Amazon OpenSearch Service without worrying about the undifferentiated heavy lifting of provisioning and managing servers to add durability to your ingest pipeline.


Enable persistent buffering

You can turn on persistent buffering for existing or new pipelines using the AWS Management Console, AWS Command Line Interface (AWS CLI), or AWS SDK. If you choose not to enable persistent buffering, the pipelines continue to use an in-memory buffer.
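As a rough illustration of the SDK route, the following Python sketch builds the arguments for the OpenSearch Ingestion `UpdatePipeline` API and passes them to the boto3 `osis` client. The pipeline name and OCU values are placeholders, and the helper function names are hypothetical, not part of any AWS library. The check for at least two OCUs reflects the minimum this post describes for pipelines with persistent buffering.

```python
from typing import Any, Dict


def build_update_request(pipeline_name: str, min_units: int, max_units: int) -> Dict[str, Any]:
    """Build keyword arguments for UpdatePipeline with persistent buffering on.

    Hypothetical helper for illustration only.
    """
    if min_units < 2:
        # The post states the minimum goes from one OCU to two with buffering.
        raise ValueError("persistent buffering needs at least 2 OCUs")
    if max_units < min_units:
        raise ValueError("max_units must be >= min_units")
    return {
        "PipelineName": pipeline_name,
        "MinUnits": min_units,
        "MaxUnits": max_units,
        "BufferOptions": {"PersistentBufferEnabled": True},
    }


def enable_persistent_buffering(request: Dict[str, Any]) -> Dict[str, Any]:
    """Send the request to AWS. Requires boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the builder above works without the SDK

    client = boto3.client("osis")
    return client.update_pipeline(**request)


if __name__ == "__main__":
    # "my-pipeline" is a placeholder name, not taken from the post.
    print(build_update_request("my-pipeline", min_units=2, max_units=4))
```

Keeping the request builder separate from the network call makes it possible to review or test the arguments before anything is changed in your account.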

By default, persistent data is encrypted at rest with a key that AWS owns and manages for you. You can optionally choose your own customer managed key (KMS key) to encrypt data by selecting the checkbox labeled Customize encryption settings and selecting Choose a different AWS KMS key. Note that if you choose a different KMS key, your pipeline needs additional permissions to decrypt and generate data keys. The following snippet shows an example AWS Identity and Access Management (IAM) permission policy that must be attached to a role used by the pipeline.

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "KeyAccess",
                "Effect": "Allow",
                "Action": ["kms:Decrypt", "kms:GenerateDataKeyWithoutPlaintext"],
                "Resource": "arn:aws:kms:us-west-2:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab"
            }
        ]
    }
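If you manage several pipelines or keys, it can help to generate this policy document rather than edit it by hand. The sketch below is one way to do that in Python. The function name is hypothetical, and the two action names (`kms:Decrypt` and `kms:GenerateDataKeyWithoutPlaintext`) are an assumption based on the "decrypt and generate data keys" requirement described above, so verify them against the current OpenSearch Ingestion documentation before use.

```python
import json


def kms_policy_for_pipeline(key_arn: str) -> str:
    """Return an IAM policy document (JSON text) granting a pipeline role
    access to one customer managed KMS key. Illustrative helper only."""
    if not (key_arn.startswith("arn:") and ":kms:" in key_arn):
        raise ValueError("expected a KMS key ARN")
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "KeyAccess",
                "Effect": "Allow",
                # Assumed action names; confirm against the AWS documentation.
                "Action": ["kms:Decrypt", "kms:GenerateDataKeyWithoutPlaintext"],
                "Resource": key_arn,
            }
        ],
    }
    return json.dumps(policy, indent=4)


if __name__ == "__main__":
    example_arn = (
        "arn:aws:kms:us-west-2:111122223333:"
        "key/1234abcd-12ab-34cd-56ef-1234567890ab"
    )
    print(kms_policy_for_pipeline(example_arn))
```

Scoping `Resource` to a single key ARN, as the example policy does, keeps the pipeline role from using any other key in the account.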

Provision for persistent buffering

Once persistent buffering is enabled, data is retained in the buffer for 72 hours. Amazon OpenSearch Ingestion keeps track of the data written into a sink and automatically resumes writing from the last successful checkpoint should there be an outage in the sink or other issues that prevent data from being successfully written. No additional services or components are needed for persistent buffers other than the minimum and maximum OpenSearch Compute Units (OCUs) set for the pipeline. When persistent buffering is turned on, each Ingestion-OCU is capable of providing persistent buffering along with its existing ability to ingest, transform, and route data. Amazon OpenSearch Ingestion dynamically allocates the buffer from the minimum and maximum allocation of OCUs that you define for the pipelines.

The number of Ingestion-OCUs used for persistent buffering is dynamically calculated based on the source, the transformations on the streaming data, and the sink that the data is written to. Because a portion of the Ingestion-OCUs now applies to persistent buffering, you need to increase the minimum and maximum Ingestion-OCUs when turning on persistent buffering in order to maintain the same ingestion throughput for your pipeline. The number of OCUs that you need with persistent buffering depends on the source that you are ingesting data from and on the type of processing that you are performing on the data. The following table shows the number of OCUs that you need with persistent buffering for different sources and processors.

| Sources and processors | Ingestion-OCUs with buffering, compared to the number of OCUs without persistent buffering needed to achieve similar data throughput |
| --- | --- |
| HTTP with no processors | 3 times |
| HTTP with Grok | 2 times |
| OTel Logs | 2 times |
| OTel Trace | 2 times |
| OTel Metrics | 2 times |

You have full control over how you set up OCUs for your pipelines, and you can decide between increasing OCUs for higher throughput or reducing OCUs for cost control at a lower throughput. Also, when you turn on persistent buffering, the minimum OCUs for a pipeline go up from one to two.
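The sizing guidance above can be turned into a quick estimate. The following Python sketch applies the multipliers from the table to a pipeline's current OCU count and enforces the two-OCU minimum. The workload keys and function name are made up for illustration, and the result is only a starting point, since the post notes that actual OCU usage is calculated dynamically.

```python
# Multipliers taken from the table in this post; dictionary keys are illustrative.
BUFFERING_MULTIPLIER = {
    "http": 3,          # HTTP with no processors
    "http_grok": 2,     # HTTP with Grok
    "otel_logs": 2,
    "otel_trace": 2,
    "otel_metrics": 2,
}

# Minimum OCUs for a pipeline once persistent buffering is turned on.
MIN_OCUS_WITH_BUFFERING = 2


def ocus_with_buffering(workload: str, ocus_without_buffering: int) -> int:
    """Estimate OCUs needed to keep similar throughput with buffering on."""
    if ocus_without_buffering < 1:
        raise ValueError("a pipeline uses at least 1 OCU")
    multiplier = BUFFERING_MULTIPLIER[workload]
    return max(MIN_OCUS_WITH_BUFFERING, multiplier * ocus_without_buffering)


if __name__ == "__main__":
    for workload in BUFFERING_MULTIPLIER:
        print(workload, ocus_with_buffering(workload, 2))
```

For example, a pipeline that ingests HTTP data with no processors on 2 OCUs would be estimated at 6 OCUs with persistent buffering, while an OTel Logs pipeline on 2 OCUs would be estimated at 4.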

Availability and pricing

Persistent buffering is available in all the AWS Regions where Amazon OpenSearch Ingestion is available as of November 17, 2023. These include US East (Ohio), US East (N. Virginia), US West (Oregon), US West (N. California), Europe (Ireland), Europe (London), Europe (Frankfurt), Asia Pacific (Tokyo), Asia Pacific (Sydney), Asia Pacific (Singapore), Asia Pacific (Mumbai), Asia Pacific (Seoul), and Canada (Central).

Ingestion-OCUs remain at the same price of $0.24 per hour. OCUs are billed on an hourly basis with per-minute granularity. You can control the costs that OCUs incur by configuring the maximum number of OCUs that a pipeline is allowed to scale to.
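To see how the maximum OCU setting bounds your spend, here is a small Python sketch of the arithmetic. It assumes the $0.24 rate applies per OCU per hour and that per-minute granularity means usage is prorated by the minute. Both are interpretations of the pricing statement above, so check the pricing page for the authoritative rules. The function names are illustrative.

```python
PRICE_PER_OCU_HOUR = 0.24  # USD, as quoted in this post


def ocu_cost(ocus: int, minutes: int) -> float:
    """Cost in USD of running `ocus` OCUs for `minutes`, prorated per minute."""
    return round(ocus * (minutes / 60) * PRICE_PER_OCU_HOUR, 2)


def max_cost_for_days(max_ocus: int, days: int = 30) -> float:
    """Upper bound on cost if the pipeline sat at its maximum OCUs throughout."""
    return ocu_cost(max_ocus, days * 24 * 60)


if __name__ == "__main__":
    print(ocu_cost(4, 90))       # 4 OCUs for 90 minutes
    print(max_cost_for_days(4))  # ceiling for a 30-day period at 4 OCUs
```

Under these assumptions, a pipeline capped at 4 OCUs cannot cost more than about $691.20 over 30 days, which is the kind of ceiling the maximum OCU setting gives you.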


In this post, we showed you how to configure persistent buffering for Amazon OpenSearch Ingestion to enhance data durability and simplify data ingestion architecture for Amazon OpenSearch Service. Refer to the documentation to learn about other capabilities provided by Amazon OpenSearch Ingestion to build a sophisticated architecture for your ingestion needs.

About the Authors

Muthu Pitchaimani is a Search Specialist with Amazon OpenSearch Service. He builds large-scale search applications and solutions. Muthu is interested in the topics of networking and security, and is based out of Austin, Texas.

Arjun Nambiar is a Product Manager with Amazon OpenSearch Service. He focuses on ingestion technologies that enable ingesting data from a wide variety of sources into Amazon OpenSearch Service at scale. Arjun is interested in large-scale distributed systems and cloud-native technologies and is based out of Seattle, Washington.

Jay is a Customer Success Engineering leader for OpenSearch Service. He focuses on the overall customer experience with OpenSearch. Jay is interested in large-scale OpenSearch adoption and distributed data stores, and is based out of Northern Virginia.

Rich Giuli is a Principal Solutions Architect at Amazon Web Services (AWS). He works within a specialized group helping ISVs accelerate adoption of cloud services. Outside of work, Rich enjoys running and playing guitar.


