Saturday 8 October 2011

In Memory Database - an Eye Opener

We have already seen something about SAP HANA; now we will look more closely at the in-memory database.

An in-memory database (IMDB) is a database management system that primarily relies on main memory for computer data storage. It is contrasted with database management systems which employ a disk storage mechanism. Main memory databases are faster than disk-optimized databases since the internal optimization algorithms are simpler and execute fewer CPU instructions. Accessing data in memory reduces the I/O reading activity when querying the data, which provides faster and more predictable performance than disk. Main memory databases are often used in applications where response time is critical, such as telecommunications network equipment and mobile advertising networks.

The in-memory approach involves placing the entire database within a server's working memory, rather than storing it on disk. This cuts the time it takes to write data to, or read data from, the disk, which typically can take two to four milliseconds. An in-memory write or read can take far less than a millisecond. Keeping all your tables in memory can therefore give a speedup of up to about three orders of magnitude (1000x) on data access, and you can use simple, generic indexing strategies, so the code becomes much simpler.
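To make the arithmetic behind that claim concrete, here is a minimal sketch. The disk latencies are the two to four milliseconds quoted above; the in-memory latency is an assumed figure chosen only for illustration, and the function names are hypothetical.

```python
# Disk latencies are the range quoted above; the in-memory figure is an
# assumed value picked only to illustrate the arithmetic.
DISK_LATENCY_S = (2e-3, 4e-3)
ASSUMED_MEMORY_LATENCY_S = 3e-6  # assumption, not a measurement


def speedup(disk_latency_s, memory_latency_s):
    """Return how many times longer one disk access takes than one memory access."""
    return disk_latency_s / memory_latency_s


def max_serial_accesses_per_second(latency_s):
    """Upper bound on back-to-back accesses when each waits for the previous one."""
    return 1.0 / latency_s


if __name__ == "__main__":
    for disk in DISK_LATENCY_S:
        print(
            f"disk {disk * 1e3:.0f} ms: "
            f"{speedup(disk, ASSUMED_MEMORY_LATENCY_S):,.0f}x slower than memory, "
            f"at most {max_serial_accesses_per_second(disk):,.0f} serial accesses/s"
        )
```

Under these assumptions the ratio lands near 1000x, and a single stream of strictly serial disk accesses tops out at a few hundred per second. Real systems overlap and batch I/O, so treat this only as a back-of-the-envelope bound.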

Today's largest transactional systems, such as those found in the financial community, can do as many as 300,000 to 400,000 transactions per second, and that number is expected to balloon to over a million per second in the years to come. In many cases, the database disk writes and reads are the performance bottlenecks to such systems.

In-memory databases can also be used by government and corporate intelligence organizations to quickly analyze terabytes of information.

Why will an in-memory database be faster than a disk-based database?
Data transfer in a disk-based DBMS:
Consider the handoffs required for an application to read a piece of data from a traditional disk-based database, modify it and write that piece of data back to the database. The process is illustrated in Figure 1.
  1. The application requests the data item from the database run-time through the database API.
  2. The database run-time instructs the file-system to retrieve the data from the physical media.
  3. The file-system makes a copy of the data for its cache and passes another copy to the database.
  4. The database keeps one copy in its cache and passes another copy to the application.
  5. The application modifies its copy and passes it back to the database through the database API.
  6. The database run-time copies the modified data item back to database cache.
  7. The copy in the database cache is eventually written to the file-system, where it is updated in the file-system cache.
  8. Finally, the data is written back to the physical media.


These steps cannot be turned off in a traditional database, even when processing takes place entirely within memory. And this simplified scenario doesn't account for the additional copies and transfers required for transaction logging!
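The eight handoffs listed above can be sketched as a toy model that simply counts how often the data item is copied. This is not how any particular DBMS is implemented: the class, its attributes and the use of dictionaries for the caches are all hypothetical, and transaction logging is left out, as in the list.

```python
class DiskBasedStore:
    """Toy model of the read-modify-write path; every name here is hypothetical."""

    def __init__(self, media):
        self.media = dict(media)  # stands in for the physical media
        self.fs_cache = {}        # file-system cache
        self.db_cache = {}        # database cache
        self.copies = 0           # number of data copies made so far

    def _copy(self, value):
        # bytearray(...) always builds a new object, so each call is a real copy.
        self.copies += 1
        return bytearray(value)

    def read(self, key):
        # Steps 1-2: the request travels from the application down to the file system.
        self.fs_cache[key] = self._copy(self.media[key])     # step 3: file-system cache
        self.db_cache[key] = self._copy(self.fs_cache[key])  # steps 3-4: database cache
        return self._copy(self.db_cache[key])                # step 4: application's copy

    def write(self, key, value):
        self.db_cache[key] = self._copy(value)               # steps 5-6: back to database cache
        self.fs_cache[key] = self._copy(self.db_cache[key])  # step 7: file-system cache
        self.media[key] = self._copy(self.fs_cache[key])     # step 8: physical media


if __name__ == "__main__":
    store = DiskBasedStore({"balance": bytearray(b"100")})
    item = store.read("balance")
    item[:] = b"150"              # the application modifies its private copy
    store.write("balance", item)
    print(store.copies)
```

In this model a single read-modify-write cycle makes six copies of one data item, three on the way up and three on the way down.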

In contrast, an in-memory database system (IMDS) entails little or no data transfer. The application may make copies of the data in local program variables, but it is not required to. Instead, the IMDS gives the application a pointer that refers directly to the data item in the database, enabling the application to work with the data directly. The data is still protected because the pointer is used only through the database API, which ensures that it is used properly. Eliminating multiple data transfers streamlines processing. Cutting multiple data copies reduces memory consumption, and the simplicity of this design makes for greater reliability.
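The pointer-based access described above might look roughly like the sketch below. Python has no raw pointers, so a `memoryview` stands in for one: it refers to the stored bytes without copying them. The class and method names are hypothetical and do not correspond to any real IMDS API.

```python
class InMemoryStore:
    """Toy in-memory store that hands out references instead of copies."""

    def __init__(self):
        self._items = {}

    def put(self, key, value):
        # The only copy: the value is placed into the store's own memory.
        self._items[key] = bytearray(value)

    def get_ref(self, key):
        # A memoryview refers to the stored bytes directly; nothing is copied.
        return memoryview(self._items[key])

    def update(self, ref, offset, data):
        # Writes go through the API, which checks bounds before touching the item.
        if offset < 0 or offset + len(data) > len(ref):
            raise ValueError("write outside the data item")
        ref[offset:offset + len(data)] = data  # modifies the stored bytes in place


if __name__ == "__main__":
    store = InMemoryStore()
    store.put("balance", b"100")
    ref = store.get_ref("balance")
    store.update(ref, 0, b"150")
    print(bytes(store.get_ref("balance")))
```

The update changes the item where it lives, so there is no chain of cache-to-cache copies, and the bounds check in `update` plays the role of the API guarding how the reference is used.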

In recent years, main memory databases have attracted the interest of larger database vendors. TimesTen, a start-up company founded by Marie-Anne in 1996 as a spin-off from Hewlett-Packard, was acquired by Oracle Corporation in 2005. Oracle now markets this product as both a standalone database and an in-memory database cache to the Oracle database. IBM acquired SolidDB in 2008, and Microsoft was widely rumored to be launching an in-memory solution in 2009. SAP announced general availability of SAP HANA in June 2011.
