Skip to main content

Applied AI Series - RAG Better Results with Re-ranking

· 11 min read
Niko
Software Engineer @ Naver

Re-ranking in Retrieval-Augmented Generation (RAG) refines the documents retrieved in response to a user query, ensuring that only the most relevant and contextually appropriate ones are passed to the generation model. This step enhances response accuracy, handles ambiguity, and improves overall result quality by prioritizing the best matches for the query, ultimately leading to more precise and coherent AI-generated answers.

Applied AI Series - RAG Speedup LLMs with Document chunking

· 9 min read
Niko
Software Engineer @ Naver

Document chunking is crucial for optimizing Retrieval-Augmented Generation (RAG) systems by breaking large documents into smaller, manageable pieces, which significantly speeds up retrieval and enhances the relevance of results. In RAG, where information retrieval is followed by text generation, chunking allows the system to search and process only the most relevant sections of content, improving both efficiency and accuracy. This approach ensures faster retrieval times, better handling of long-form documents, and more precise generation by focusing on contextually meaningful chunks rather than entire documents, ultimately enhancing the overall performance of LLMs in real-time applications.

Applied AI Series - What is RAG?

· 5 min read
Niko
Software Engineer @ Naver

Welcome to the latest installment of our Applied AI Series, where we delve into cutting-edge technologies and explore how they are transforming industries and everyday tasks. In this edition, we’ll unravel the intricacies of RAG (Retrieve and Generate), a powerful framework that’s making waves in the field of artificial intelligence. Whether you're a tech enthusiast, a researcher, or someone curious about AI, this post will provide a clear understanding of RAG and its applications.

From Jekyll to Docusaurus, I Migrated My Blog After 5 Years

· 4 min read
Niko
Software Engineer @ Naver

Reflecting on 5 Years with Jekyll: Why I Migrated My Engineering Blog to Docusaurus

After five years of running my engineering blog on Jekyll, I’ve decided to migrate to Docusaurus. This decision came after extensive evaluation of my blogging needs and the capabilities of various platforms. In this post, I’ll share the reasons behind the migration, the benefits I’ve experienced, and the improvements I’ve seen with Docusaurus.

Javascript series - Understanding array Map and Reduce

· 3 min read
Niko
Software Engineer @ Naver

JavaScript provides a number of built-in array methods that can be used to manipulate arrays in a variety of ways. Two of the most powerful and commonly used methods are map and reduce. In this section, we will explore what these methods do, and how to use them to perform complex operations on arrays of data in JavaScript.

Microservice monitoring - Application Logs

· 10 min read
Niko
Software Engineer @ Naver

Correlate Requests With a Unique ID

Think back to the request calling chain between Services A, B, and C that I talked about in the previous section. Following from this idea, it's a good practice to tag each call with a unique ID that identifies the request.

For example, let's say you're logging access and error logs for each service. If you find an error in Service B, it might be useful for you to know whether the error was caused by the request coming from Service A or the request sent to Service C.

Maybe the error is informative enough that you don't even need to reproduce it. But if that isn't the case, the correct way to reproduce the error is to know all possible requests in the services that are related to, say, Service B. When you have a correlation request ID, then you only need to look for that ID in the logs. And you'll get all logs from services that were part of the main request to the system. You'll also know which service the main request spends the most time in, if the service is using the cache, if a service is calling other services more than once, and a lot of other interesting details.

Send Logs to a Centralized Location

Let's assume that you're already adding all sorts of useful information to your logs. But it's essential to send logs to a centralized location.

Think about it. If you have to log in to each individual server to read logs, you'll spend more time trying to correlate problems than if you just had one location where you could access all logs. Also, systems usually get more complicated as time goes by, so the amount of microservices usually grows too. And to make things even more complicated, services could be hosted on different servers or providers.

Centralized logging is becoming the norm, especially if you're working with cloud, container, or hybrid environments because the servers could be terminated without any notice. For example, containers are terminated if there's an unexpected error, or the memory reaches 100 percent of its consumption capacity.

You could solve this problem by having agents pulling and pushing logs every five minutes or before servers get terminated. You could also configure a cronjob in the server, a sidecar container, or a shared file location where another process could centralize logs. Avoid the temptation of building a solution stack by yourself, aswhere to centralize logging is a well-known problem that's already solved.

Having logs from all of your services in one place makes it easy and efficient to correlate problems.

Make sure log aggregation is truly centralized

It’s not enough to aggregate some log data into one place (such as a public cloud vendor’s log manager, like CloudWatch) and aggregate other data somewhere else (like a third-party log management tool). Although this approach may seem like the way to go if some microservices run in one location and others run somewhere else, you need all of your log data in a single location if you want to analyze and store it effectively.

Structure Your Log Data

It's going to be almost impossible to have a defined format for log data; some logs might need more fields than others, and those that don't need all those excess fields will be wasting bytes. Microservice architecture addresses this issue by using different technology stacks, which impacts the log format of each service. One service might use a comma to separate fields, while others use pipes or spaces.

All of this can get pretty complicated. Make the process of parsing logs simpler by structuring your log data into a standard format like JavaScript Object Notation (JSON). JSON allows you to have multiple levels for your data so that, when necessary, you can get more semantic info in a single log event.

Parsing is also more straightforward than dealing with particular log formats. With structured data, the format of logs is standard, even though logs could have different fields. You'll also be able to create searches in the centralized location like looking for logs that have an "http_code" of 500 and above. Use structured logging to have a standard but flexible format in your microservices logs.

Add Context to Every Request

I don't know about you, but when something goes wrong, I want to know everything! Having enough information about an issue provides you with important context for the request. Knowing what could have caused the problem is essential, and having the right context will help you to understand what's happening more quickly. But adding context to logs could become a repetitive task in code because there's going to be common data like date and time that you'll need in each log event. So in code, logging will look simpler because you'll only be logging the message and other unique areas.

You might want to log all the data you can get. But let me give you some specific fields that could help you figure out what you really need to log.

  • Date and time. It doesn't have to be UTC as long as the timezone is the same for everyone that needs to look at the logs.
  • Stack errors. You could pass the exception object as a parameter to the logging library.
  • The service name or code, so that you can differentiate which logs are from which microservice.
  • The function, class, or file name where the error occurred so that you don't have to guess where the problem is.
  • External service interaction names-you'll know which call to the DB was the one with the problem.
  • The IP address of the server and client request. This information will make it easy to spot an unhealthy server or identify a DDoS attack.
  • User-agent of the application so that you know which browsers or users are having issues.
  • HTTP code to get more semantics of the error. These codes will be useful to create alerts.
  • Contextualizing the request with logs will save you time when you need to troubleshoot problems in the system.

Collect Logs to Persistent Storage

As noted above, microservices often run inside infrastructure—like a container—that lacks persistent storage. A basic and essential best practice in that case is to ensure that log data is written to somewhere where it will be stored persistently and remain available if the container shuts down.

You could do this by modifying source code or your container configurations to ensure that the logs are written to an external storage volume. An easier approach, however, is to run a logging agent that will collect data from the containerized microservice in real time and aggregate it within a reliable storage location.

This way, you kill two birds with one stone (or logging agent, as it were): You aggregate logs and move log data to persistent storage, all in one step.

Log Useful and Meaningful Data to Avoid Regret

These are just some ordinary things that help you log microservices. If you're just starting out with logging, these practices might not make much sense and seem useless. But once you've been using microservices for a while, they'll save you a lot of trouble. Just make sure you're always evaluating what you're logging. You'll have a better idea of what's important to log when you're troubleshooting and say to yourself "I wish I had X and Y information so I could spot these weird errors more easily."

Once you're logging enough data, it's time to automate things like alerts. You shouldn't have to spend a ton of time reading logs to spot problems. Automating alerts will also help you to be proactive and keep errors from spreading to all of your users.

Having a centralized logging location to make further analysis is a must in microservices. Adding enough context to logs will make the difference in identifying what log data is useful, and what data is useless.

Logs aggregations

Use custom parsing

Because microservices logs are often structured in multiple ways, trying to search through all of your data using generic regexes is typically not very effective. Instead, consider writing custom parsing rules that govern how your log analytics tool identifies relevant trends within log data, even if you are working with logs of varying types or structures.

Make Logs Searchable

Make sure all the fields available in the logs are searchable. For example, If you get ahold of the correlation ID, you can search all the logs based on that t to find out the request flow.

elk-logs.jpg

The amount of logs collected is usually very large. Therefore, we need to optimize the storage and especially ensure query performance. To do that we need to indexing logs data before putting it into durable storage. For example, we use ELK to collect and process logs. We will index on Elasticsearch.

Visualize log data

When working in microservices-based applications, developers often need to examine the state of an application, as well as information on any issues, downtimes or latency. Teams can add dashboard features that provide a visual depiction of the information carried into the application logs.

Log visualization illustrates aggregated information for the services. This information may include, but is not limited to, requests, responses and size of responses. There are various log visualization tools available, such as Scalyr, Graylog, Kibana and Sematext.

Data partitioning to optimize storage cost

data-partition-storage.png

With a huge amount of data from logs, the problem of optimizing storage costs is a big problem. According to the design pattern for big data storage, we will partition the data by query terms, including:

  • Hot data: frequently queried data is usually new logs or realtime logs.
  • Warm data: are data types that are not frequently queried. Middle term query. Usually historical data will be used to access when needed.
  • Cold data: are data types that are almost not used for querying, but for the purpose of archiving and long-term storage.

Because the cost for different types of hosting services is also different according to the query plan. So we need to set up plans to transfer logs data right for the purpose of use to optimize costs. For example, for Storage services for Hot data, the cost of the bandwitch will be quite low and the cost of storage and access speed will be quite high. In contrast to Storage services for Cold data, the cost for storage is quite low, but the cost of access will be more expensive and of course the access speed will be lower. The cost difference between Storage services for data types depends on the Query plan that needs corresponding hardware configuration.

Read more:

References:

Microservice monitoring - Log centralization

· 6 min read
Niko
Software Engineer @ Naver

Problems

We all know importance of logging in applications but it is even more crucial in distributed systems. There are challenges of logging in microservices architectures.

As the amount of services in a microservice architecture rises, complexity naturally also rises. Bugs and individual service failures can be very tricky to deal with and having to spend hours digging deep to find the root of an unnamed error can be a daunting and unproductive task.And they get even more complicated when one or more services fail.

  • Which service failed?
  • Why?
  • And under what circumstances?

To effectively deal with the challenges of system errors, request chain breakdowns, or even simply to stay on top of your system architecture, logging is a vital tool. It is undeniable that it is the cornerstone of maintaining and debugging your app efficiently. All these questions are hard to answer if you don’t have good, meaningful logs.

Why we need a Centralized Log Management (CLM) for microservice architecture?

microservice architecture

In the microservice pattern, thousands of services operate independently and are distributed across many different infrastructures with each service having different types of logs. Therefore we need a central control of these logs synchronously and centrally manage them. Ensure that when any errors arise, we can trace them and offer a quick and thorough solution.

In order to ease this entire process, many solutions such as a Centralized Log Management (CLM) solution comes into the picture.

Monitoring containerized microservices with a centralized logging architecture.

The CLM Pillars of Observability

Based on usage goals and lifecycle round, CLM divided logs data into 3 main categories:

  • Metrics
  • Logs & Events
  • Traces

three_pillars_observability.jpg

1. Metrics

Metrics probably represents the most valuable of the three monitoring tools because:

So many resources are designed to spit out various bits of health and performance information (and there are loads of tools to collect this information). They are efficient. They are frequently generated. They are easily correlated across elements of your infrastructure. Everything from operating systems to applications generate metrics which, at the least, are going to include a name, a time stamp and a field to represent some value. Since so many resources come ready to tell us about themselves, metrics is an obvious place to start when it comes to monitoring.

Most all metrics will enable you to tell if a resource is alive or dead, but if the target is valuable enough you’ll want to be able to ascertain what is actually wrong with the system or going wrong. As you can imagine, the latter will require detailed information about what is happening inside the system, so called white-box monitoring that relies on internal instrumentation. The more rudimentary black-box approach draws conclusions about the health of a system based on externally visible indicators (is it responding to any system calls?).

But perhaps the most important thing to understand about metrics is that last bullet about being able to correlate metrics across infrastructure components. Given the complex interdependencies common to IT environments today, the ability to stitch together metrics to get a bigger picture view is a real time saver. And it becomes even more critical as we move to cloud-native environments because of the dynamic nature of cloud infrastructure and the ever-changing relationship between that infrastructure and the applications.

An initial challenge of harnessing metrics is the variety of the information available and the number of tools needed to collect and make sense of that information. Then there is the question of how you store data in so many formats from so many resources. But the resultant upside more than makes up for the effort required to figure out how to harness the information.

It is also worth noting that, given there is no standard API for metric collection, many organizations use agents to collect data that is either pushed to a central location for analysis or pulled by that central resource. Gartner says agents frequently referenced by customers include push agents Collectd and Telegraf, while Prometheus is cited as a tool to pull information from targets.

2. Logs & Events

Logs are perhaps the second most important tool in the monitoring toolbox because virtually everything logs information about what they are doing at any given time. What’s more, logs tend to give more in-depth information about resources than metrics. So, if metrics showed the resource is dead, logs will help tell you why it died.

The problem with logs is there can be too much of a good thing. With everything in your environment tracking what they are doing and anxious to share that information, it is easy to see how that could result in a mountain of data. Instead of streamlining the monitoring process, you are simply creating a big new centralized haystack.

And like metrics, differences in log formats and the abundance of tools available to collect and make sense of logs, complicates the job of getting the most out of this rich trove of material. There are, however, a number of common tools used for collecting logs, such as syslog and open source tools such as Fluentd.

The trick to getting the most out of these tools is limiting what you collect to keep it manageable, and, where possible, to home in on common fields so you can more easily find the needles in the haystack.

3. Traces

Last but not least in the monitoring triangle is application trace data, which “traces” information about specific application operations. With so many application interdependencies these days, these operations will typically involve hops through multiple services (so called spans).

Traces, then, add critical visibility into the health of an application end-to-end. They are, however, solely focused on the application layer and provide limited visibility into the health of the underlying infrastructure. So, even if you collect traces, you still need metrics to get the full story of your environment. APM tools feed trace information to a centralized metrics store, so traces provide a nice source of data for an app-centric view.

The need for the viewpoint that traces can provide is exacerbated in container-based microservice architectures that are nothing more than a collection of stitched together services. These environments can be addressed with something called distributed tracing.

References: