REDUCING ATTACHMENT STORAGE COSTS DURING MIGRATION TO S/4HANA
In this blog we look at how database choice affects data management, and in particular what consequences it has for attachments in SAP. One issue that often remains underexposed is the way attachments and generated documents are stored in SAP. It is not uncommon for attachment storage to take up a significant portion of the database. See also our guides on SAP data migration and SAP HANA migration.
For general background, find out what enterprise content management is and what it means specifically in the SAP environment.
Many SAP users link local PC files to SAP documents. Via the "Generic object services" button, available in almost all standard SAP transactions, files such as PDF, Word or Excel documents can be attached to the underlying SAP document and opened from there. In practice, for example, quotation files are linked to purchase orders, or incoming purchase orders are linked to sales orders. In addition, many documents, especially PDF documents, are generated by SAP itself. Think for example of outgoing purchase orders. All these documents end up in the SAP database.
In this blog, we want to make clear that storing these attachments is more efficient and much cheaper when the storage is moved out of the database to an external content repository. Especially when a HANA database is introduced in the SAP landscape, this is, from an architectural point of view, more of a requirement than an option. We will also briefly discuss how this move from the database to an external content repository can be approached.
Research among some of our customers shows that in some cases the table where attachments and generated documents are stored (table SOFFCONT1) can take up as much as 35% of the total database. Given the cost of storage in an SAP HANA environment, reducing the size of the database can yield significant cost savings. But the benefits can also be significant when traditional databases are used. Think of backup procedures or system copies, which become smaller and faster.
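To make the 35% figure concrete, here is a minimal back-of-the-envelope sketch in Python. Only the 35% share comes from the text above; the database size and both prices per GB are invented placeholders, not real quotes, and the function name is hypothetical. Replace them with your own figures.

```python
def estimate_attachment_savings(db_size_gb, attachment_share,
                                db_cost_per_gb_month, repo_cost_per_gb_month):
    """Return (freed_gb, monthly_saving) when attachments leave the database.

    The saving is the difference between keeping the attachment volume in the
    database and keeping the same volume in an external content repository.
    """
    if not 0.0 <= attachment_share <= 1.0:
        raise ValueError("attachment_share must be between 0 and 1")
    freed_gb = db_size_gb * attachment_share
    monthly_saving = freed_gb * (db_cost_per_gb_month - repo_cost_per_gb_month)
    return freed_gb, monthly_saving


freed_gb, monthly_saving = estimate_attachment_savings(
    db_size_gb=2000,              # assumed total database size
    attachment_share=0.35,        # upper figure quoted above for SOFFCONT1
    db_cost_per_gb_month=10.0,    # assumed placeholder price for in-memory storage
    repo_cost_per_gb_month=0.5,   # assumed placeholder price for repository storage
)
print(f"Freed in the database: {freed_gb:.0f} GB")
print(f"Estimated saving per month: {monthly_saving:.0f}")
```

The calculation deliberately ignores one-off migration effort and secondary effects such as shorter backups and system copies, so it should be read as a rough lower-level estimate only.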
Due to the continuous developments in the ICT sector in recent decades, not only have the possibilities of IT applications increased exponentially, but so have the possibilities for storing data. As a result, companies now have access to large amounts of data and more and more opportunities to analyze and process it. A frequently used phrase is "Data is the new currency". That data is valuable seems to be a foregone conclusion for everyone. To optimize and accelerate business processes, not only the data itself but also rapid access to and processing of that data is increasingly a requirement. Real-time data analysis is high on the priority list of most managers, but it is often obstructed by the current hardware and software within companies.
The limiting factor for fast, real-time data processing is the so-called traditional database. A traditional database can only handle one work process at a time for a specific application. For each application that is developed, the data from the database is configured and optimized for that application. To do this, data is continuously moved and duplicated to meet the specific needs of each application.
The more applications are used within a company, the harder it becomes to give each application quick access to the data on the database. A traditional solution for this is a data warehouse. The data is aggregated and consolidated in such a way that applications can quickly report on it. However, this has the consequence that the data is no longer accessible in real-time and that compromises are made with regard to the level of detail of the data.
To overcome the shortcomings of traditional databases, SAP has developed a so-called in-memory database in which data is directly accessible: the SAP HANA database. In addition to quick access to the in-memory data, queries on the data, for example, are also processed faster. Data does not have to be duplicated in order to be processed; processing can take place directly on the database. Among other things, the performance gain achieved with a HANA database removes the need to consolidate and aggregate data into, for example, data warehouses. Data is directly and quickly accessible without compromising on its level of detail.
The SAP HANA database has been available in the IT landscape for some time (since 2010) and can therefore be used with, for example, an SAP ECC system. However, where an SAP HANA database is optional for an SAP ECC system, it is a requirement for an S/4HANA system. More and more companies are therefore switching to an SAP HANA database.
While the introduction of an SAP HANA database opens up possibilities for processing data, it also raises some concerns. Storing data in memory provides significant benefits for application performance, but it is a more expensive form of data storage than a traditional database. High performance from a traditional database is achieved through fast storage and the processing (CPU) speed of servers. In SAP HANA, however, the size of the memory itself becomes the most important resource.
Although storing data is becoming cheaper all the time, this cannot compensate for the continuous growth of the data itself. Furthermore, storing all data in memory is certainly not necessary; why would you make data directly accessible if it is never requested?
Estimates are that on average 85% of the data in a database is so-called "cold data" that is rarely or never "touched". This leaves only 15% "hot data", which is estimated to account for 90% of the interactions with the database. To deal properly with hot and cold data, one can choose to keep only part of the data (hot) in memory and store the rest (cold) on other databases. This cold data is still accessible, but is only loaded into memory when requested by a specific application.
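The hot/cold split can be illustrated with a small Python sketch. It assumes, purely for illustration, that data is classified by its last access date; the one-year window, the data set names and the sizes are invented, and the sizes were chosen so that the result matches the 15%/85% estimate above. It does not describe how SAP HANA itself tiers data.

```python
from datetime import date, timedelta


def split_hot_cold(records, today, hot_window_days=365):
    """Split (name, size_gb, last_access) tuples into hot and cold lists.

    A record counts as hot when it was accessed within the window,
    otherwise as cold. The window length is an assumption.
    """
    cutoff = today - timedelta(days=hot_window_days)
    hot = [r for r in records if r[2] >= cutoff]
    cold = [r for r in records if r[2] < cutoff]
    return hot, cold


def total_size_gb(records):
    """Sum the size column of a list of records."""
    return sum(size for _, size, _ in records)


# Invented example data: (data set, size in GB, date of last access)
records = [
    ("open_orders", 150, date(2023, 12, 15)),
    ("closed_orders_2019", 500, date(2020, 3, 1)),
    ("po_pdf_attachments", 350, date(2021, 6, 30)),
]

hot, cold = split_hot_cold(records, today=date(2024, 1, 1))
hot_gb, cold_gb = total_size_gb(hot), total_size_gb(cold)
print(f"Keep in memory: {hot_gb} GB, store elsewhere: {cold_gb} GB")
```

In this made-up example only 150 GB of the 1,000 GB would need to be sized as memory; the remaining 850 GB could live on cheaper storage.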
One type of data that can be considered "cold" is attachments within a system. Examples include outgoing purchase orders, outgoing invoices, incoming invoices, emails, notes, etc. As an example, take the generated PDF of a purchase order that is sent to a supplier. This PDF attachment is created once, then printed or e-mailed to the supplier and linked to the purchase order in SAP.
Once everything has been delivered by the supplier and paid for, the purchasing process is basically finished. Many of these PDF documents will rarely or never be opened again. So why keep these documents loaded in memory at all times? It seems like nonsense, and in fact it is!
Traditionally, attachments and generated documents in SAP are stored in the underlying database; this is effectively the default setting in SAP. But even a traditional database is in principle not meant to store so-called binary documents. What we often see with our customers is that this was the initial setup of the system and that it unintentionally stays that way for years.
The recommended system setup for this is that so-called "flat" data is stored on the database and documents or "binary" data is stored on an external content repository. A content repository has the specific purpose of storing documents and then quickly retrieving them when required. It is also more efficient and cheaper to store these types of documents in a content repository than in a database; think of backup procedures or system copies.
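The separation described above can be sketched as follows in Python: the database keeps only a small link row per attachment, while the binary content lives in the repository. Both classes, their methods and the example purchase order number are hypothetical in-memory stand-ins for illustration; they are not SAP's or any vendor's actual API.

```python
import hashlib


class ContentRepository:
    """Hypothetical stand-in for an external content repository (binary data)."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # Use a content hash as document ID; an assumption for this sketch.
        doc_id = hashlib.sha256(data).hexdigest()
        self._blobs[doc_id] = data
        return doc_id

    def get(self, doc_id: str) -> bytes:
        return self._blobs[doc_id]


class BusinessDatabase:
    """Hypothetical stand-in for the database: flat data and references only."""

    def __init__(self):
        self.attachment_links = []  # rows of (business_object, file_name, doc_id)

    def link(self, business_object: str, file_name: str, doc_id: str) -> None:
        self.attachment_links.append((business_object, file_name, doc_id))

    def links_for(self, business_object: str):
        return [row for row in self.attachment_links if row[0] == business_object]


repo = ContentRepository()
db = BusinessDatabase()

# Attach a generated PDF to an (invented) purchase order number.
pdf_bytes = b"%PDF-1.4 example purchase order"
doc_id = repo.put(pdf_bytes)
db.link("PO-4500000001", "purchase_order.pdf", doc_id)

# Retrieval: look up the reference in the database, fetch the content on demand.
_, file_name, found_id = db.links_for("PO-4500000001")[0]
content = repo.get(found_id)
print(file_name, len(content), "bytes retrieved from the repository")
```

The point of the design is that the database row stays a few dozen bytes no matter how large the attachment is, so backups and system copies of the database no longer carry the documents themselves.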
Intellidocx can provide a content repository such as Azure Blob Storage and Microsoft O365.
In an SAP landscape with a traditional database, storing documents on an external content repository was already the recommended option. The introduction of S/4HANA and the associated SAP HANA database has made the need to switch from the SAP database to an external content repository even greater. When migrating to an SAP HANA database, with or without an S/4HANA system, our advice is to definitely set up Intellidocx with Azure Blob Storage and/or O365 for the document flow in SAP.
Without a proper installation of an external content repository, all documents and attachments are by default stored on the underlying HANA database; this is unnecessary. If you are migrating to an S/4HANA system, it is our advice to first migrate all documents to Azure Blob Storage and/or O365. This way there is no need for a database migration including all existing attachments; they are already stored in Azure Blob Storage and/or O365.