As the use of computers and technology has grown rapidly over the past few decades, so has the amount of data associated with it, and with it the need to store that data efficiently.
Oxford defines storage as “the action or method of storing something for future use,” which sums up why we need storage and storage devices. With today’s complex deployments, the days of simply saving files to disk on a single server are gone. The storage needs of modern data broadly fit into two categories: Object Storage and Block Storage. Let’s compare the two.
Block Storage
Block storage is a method of storing data in fixed-size chunks called blocks. It provides a traditional block device, like a hard drive, over the network. A block is identified only by its address; it carries no metadata.
In block storage, each block holds only a portion of the data, so a file or database is spread across many blocks. A volume can be resized to accommodate growing needs, but only up to a limit. Block storage offers strong consistency and fast I/O, which makes it a good fit for frequent modifications and random read/write operations. It delivers its best performance only when the application and the storage are local to each other.
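To make the addressing model concrete, here is a minimal sketch of a block device in Python. It is a toy, not how Amazon EBS or any real driver works: the `BlockDevice` class, its method names, and the 512-byte block size are all assumptions chosen for illustration. It shows that a block is found purely by its number and can be overwritten in place without touching its neighbours.

```python
BLOCK_SIZE = 512  # bytes per block; an assumed size for illustration


class BlockDevice:
    """Toy block device: a flat run of fixed-size blocks addressed by number."""

    def __init__(self, num_blocks: int) -> None:
        self.num_blocks = num_blocks
        self._disk = bytearray(num_blocks * BLOCK_SIZE)

    def _offset(self, address: int) -> int:
        # The address is the only way to identify a block; there is no metadata.
        if not 0 <= address < self.num_blocks:
            raise IndexError("block address out of range")
        return address * BLOCK_SIZE

    def write_block(self, address: int, data: bytes) -> None:
        if len(data) > BLOCK_SIZE:
            raise ValueError("data does not fit in one block")
        start = self._offset(address)
        # Pad with zero bytes so every block stays exactly BLOCK_SIZE long.
        self._disk[start:start + BLOCK_SIZE] = data.ljust(BLOCK_SIZE, b"\x00")

    def read_block(self, address: int) -> bytes:
        start = self._offset(address)
        return bytes(self._disk[start:start + BLOCK_SIZE])


device = BlockDevice(num_blocks=8)
device.write_block(3, b"row 42: balance=100")
device.write_block(3, b"row 42: balance=250")  # in-place update of one block
```

Only block 3 is rewritten by the second call; the other seven blocks are never read or copied, which is why this model suits frequent, random modifications.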
With block storage, distributing files across multiple servers becomes complex, because a volume is typically attached to one server at a time, and this can lead to inefficient use of resources. It is commonly used for databases and transactional data. Amazon EBS is one example of block storage.
Object Storage
As the name implies, object storage is a way to organize and work with units of storage called objects: unstructured blobs of data together with their metadata. Every object in an object store has three main parts:
- Unique identifier – the address of the object, used to find it in a distributed system without knowing its physical location.
- Metadata – contextual information about the actual data.
- The data itself – anything from a simple file to an image or video.
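The three parts listed above can be sketched as a small data structure. This is an in-memory illustration only: `StoredObject`, `ObjectStore`, `put`, and `get` are hypothetical names, not the API of Amazon S3 or any other service, and using a random UUID as the identifier is an assumption.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class StoredObject:
    data: bytes                                              # the data itself
    metadata: Dict[str, str] = field(default_factory=dict)   # context about the data
    object_id: str = field(default_factory=lambda: str(uuid.uuid4()))  # unique identifier


class ObjectStore:
    """Toy object store: a flat namespace keyed by object ID."""

    def __init__(self) -> None:
        self._objects: Dict[str, StoredObject] = {}

    def put(self, data: bytes, metadata: Optional[Dict[str, str]] = None) -> str:
        obj = StoredObject(data=data, metadata=dict(metadata or {}))
        self._objects[obj.object_id] = obj
        return obj.object_id

    def get(self, object_id: str) -> StoredObject:
        # The caller needs only the ID, never a physical location.
        return self._objects[object_id]


store = ObjectStore()
photo_id = store.put(b"\x89PNG...", {"content-type": "image/png", "owner": "alice"})
photo = store.get(photo_id)
```

Note that the metadata travels with the object itself, which is the main structural difference from a block, where the address is all there is.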
In object storage, a whole piece of data is stored as a single object, no matter what type or amount of data it is. Stored data is highly durable and available, and object stores favor resiliency, often with eventual consistency. It works well for data sets that are mostly read: more of a write-once, read-many use case. Using a hosted object storage service also eliminates the need to maintain hard drives, as that is handled by the service provider.
However, object storage generally doesn’t let us incrementally edit one part of a file. An object has to be manipulated as a whole unit: the entire object is read, updated, and then rewritten in its entirety.
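The cost of that whole-object rule is easy to see in a small sketch. The functions below are hypothetical stand-ins for an object store's read and write calls, backed by a plain dictionary; they are not a real client library.

```python
from typing import Dict

# Hypothetical in-memory stand-in for an object store: key -> immutable bytes.
_bucket: Dict[str, bytes] = {}


def put_object(key: str, data: bytes) -> None:
    """Store the whole object, replacing any previous version."""
    _bucket[key] = bytes(data)


def get_object(key: str) -> bytes:
    """Return the whole object; there is no partial-write counterpart."""
    return _bucket[key]


def edit_object(key: str, old: bytes, new: bytes) -> int:
    """Change part of an object; returns how many bytes had to be rewritten."""
    whole = get_object(key)               # 1. read the entire object
    updated = whole.replace(old, new, 1)  # 2. change a local copy
    put_object(key, updated)              # 3. write the entire object back
    return len(updated)


put_object("report.txt", b"status=draft;" + b"x" * 1000)
rewritten = edit_object("report.txt", b"draft", b"final")
```

Here a five-byte change forces all 1013 bytes to be written again, which is why object storage suits data that is written once and read many times.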
Amazon S3 is one example of object storage.
So, which storage method is better: Block Storage or Object Storage? Based on the above discussion, we can conclude that Object Storage is the best option for applications that handle huge amounts of unstructured data, whereas Block Storage is the best option for databases and file I/O.
Do comment and share your views!