Keys to documenting ETL processes

We can document ETL processes, that is, those carried out with an extraction, transformation and loading (ETL) tool, either after their design and implementation or at the same time. We can do so with the native tool, since each project is to some degree self-documenting, or with a different one.

In any case, when documenting ETL processes it is essential to capture the heart of the project. This means producing a clear, well-structured report that will be delivered to the client as proof of the final work carried out and that will also be very useful for maintenance and continuous improvement.

How to document ETL processes?

There is no commonly accepted standard or methodology that we can follow to document an ETL procedure and its logic in practice. In reality, this work is often not done at all. Even so, it is necessary if we want documentation that reflects the development and the result of the project, presented in an illustrative way and complementing the implemented procedure itself.

Put simply, there are a number of key issues that we must pay attention to in order to document ETL processes in the best feasible way. They are the following:

  • Flexible methodology. The absence of a standard or methodology, mentioned above, leaves us free to find our own way of documenting ETL processes. In addition to the utilities provided by the tool used, there are different methods that can help us visualize the implemented ETL, such as value stream mapping (VSM), so that the visualization itself can serve as the documentation or as part of it, within a more complete report.

  • Metadata, a good help. Since the documentation is to some extent implicit in the metadata of the ETL itself, we should bear in mind that a good implementation makes it possible to visualize the processes at a glance. Thus, at the end of the project, it should be feasible to visualize the data flow graphically and use it as background information to document the movement of the data, adding an entry and specifying the input and output flows of the ETL procedure.

  • Document the heart of the procedure. The documentation should reflect the important part, the core of the procedure. It therefore cannot lack an introduction, a review of the requirements, a summary of the business rules applied, and information about the tests carried out to arrive at the result. Keep in mind that the documentation of ETL processes covers the essential aspects of topics as varied as the design, the sources and destinations, the solutions applied, and the transformations made, among others. Beyond these aspects, and whether the documentation is produced manually or automatically, with the native tool or without it, it must also make it possible to compare the requirements raised at the beginning of the project with the results obtained.

    Finally, document ETL processes keeping in mind that the purpose of this task is to create a versatile document. Besides being useful for informational purposes and as a record, the documentation of the ETL processes must provide a visual representation of them, through the diagrams and notes needed to support the work of programmers and maintenance personnel, while also providing the means to carry out a subsequent process of continuous improvement as the ultimate goal.
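
As an illustration of the points above, here is a minimal Python sketch of how step metadata could drive both a data flow diagram and a requirements check. It is not tied to any particular ETL tool: the `Step` structure, the table names and the requirement IDs (`R1`, `R2`, `R3`) are all hypothetical, and the diagram is emitted as Graphviz DOT text only because that is one common plain-text format for graphs.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Step:
    """One ETL step described by its metadata (hypothetical structure)."""
    name: str
    inputs: list[str]
    outputs: list[str]
    rule: str = ""  # business rule applied, in plain words
    requirements: list[str] = field(default_factory=list)  # requirement IDs covered


def to_dot(steps: list[Step]) -> str:
    """Render the data flow as Graphviz DOT text (sources -> step -> targets)."""
    lines = ["digraph etl {", "  rankdir=LR;"]
    for step in steps:
        lines.append(f'  "{step.name}" [shape=box];')
        for src in step.inputs:
            lines.append(f'  "{src}" -> "{step.name}";')
        for dst in step.outputs:
            lines.append(f'  "{step.name}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)


def uncovered(requirements: list[str], steps: list[Step]) -> list[str]:
    """Return the requirements that no step claims to cover, in original order."""
    covered = {req for step in steps for req in step.requirements}
    return [req for req in requirements if req not in covered]


# Illustrative metadata for a two-step flow (all names are made up).
steps = [
    Step("extract_orders", ["crm.orders"], ["staging.orders"], requirements=["R1"]),
    Step("clean_orders", ["staging.orders"], ["dw.fact_orders"],
         rule="drop rows without a customer id", requirements=["R2"]),
]

print(to_dot(steps))
print("Uncovered requirements:", uncovered(["R1", "R2", "R3"], steps))
```

With this sample data, `R3` is reported as uncovered, which is the kind of gap between initial requirements and results that the documentation should make visible. The DOT text can be pasted into any Graphviz renderer to obtain the diagram for the report.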

