Is it correct to say that the benefits of using HBase instead of just using HDFS are: 1) Faster writes and reads because of the LSM tree with its in-memory memtable. 2) Column-oriented storage (inside SSTables), which makes it better to run analytics jobs (MapReduce, Spark, etc.) over HBase than directly over HDFS. That's it, right?
@jordanhasnolife5163 15 days ago
That sounds about right to me. HDFS also isn't a database, you literally just write and retrieve files. You need some layer on top of that to provide a database abstraction.
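To make the "layer on top of files" idea concrete, here is a minimal sketch of an LSM-style store. It is a toy under stated assumptions, not HBase's implementation: `TinyLSM` and `memtable_limit` are made-up names, and the in-memory sorted lists stand in for the immutable SSTable files that would really be written to HDFS.

```python
import bisect


class TinyLSM:
    """Toy LSM store: writes hit an in-memory memtable, which is flushed
    to an immutable sorted run (a stand-in for an SSTable file) when full."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}      # most recent writes, held in memory
        self.sstables = []      # sorted (key, value) runs, oldest first
        self.memtable_limit = memtable_limit  # hypothetical flush threshold

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Writing a sorted, immutable run is the only "file write" needed.
        if self.memtable:
            self.sstables.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        # Newest data wins: check the memtable, then runs from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):
            keys = [k for k, _ in table]
            i = bisect.bisect_left(keys, key)  # binary search in a sorted run
            if i < len(table) and table[i][0] == key:
                return table[i][1]
        return None


db = TinyLSM(memtable_limit=2)
db.put("a", 1)
db.put("b", 2)   # memtable is full, so it is flushed to a sorted run
db.put("a", 3)   # newer value lives in the memtable and shadows the old one
```

Writes only touch memory plus an occasional sequential flush, which is where the write speed comes from; reads may have to check several runs.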
@NIVO1000 2 years ago
It's still not clear to me why HBase has faster reads than Cassandra. They are both column-style, so doesn't that generally mean Cassandra would have a comparable read time?
@jordanhasnolife5163 2 years ago
You're mixing up the format of the table with the underlying storage of the data in files on disk. Cassandra uses row-oriented storage, meaning that there is greater locality of data within the same row. HBase uses column-oriented storage, which means that there is greater locality of data within the same column. Hence, for full table scans over a single column, HBase should be better, while Cassandra should be faster for reading isolated rows.
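The locality argument above can be sketched with a toy layout. This is only an illustration: the table, the column names, and the `offsets` helper are made up, and real on-disk formats are far more involved. Each cell is placed in a flat list (a stand-in for a file), and we look at which positions a query has to touch.

```python
# Hypothetical three-row table used only for illustration.
COLUMNS = ["id", "name", "age"]
rows = [
    {"id": 1, "name": "ann", "age": 31},
    {"id": 2, "name": "bob", "age": 45},
    {"id": 3, "name": "cy", "age": 27},
]

# Row-oriented: all cells of one row are adjacent.
row_major = [(r["id"], col, r[col]) for r in rows for col in COLUMNS]

# Column-oriented: all cells of one column are adjacent.
col_major = [(r["id"], col, r[col]) for col in COLUMNS for r in rows]


def offsets(layout, *, row_id=None, column=None):
    """Return the positions in the flat layout that a query must read."""
    return [
        i
        for i, (rid, col, _value) in enumerate(layout)
        if (row_id is None or rid == row_id)
        and (column is None or col == column)
    ]


# Scanning one column: scattered in the row layout, contiguous in the column layout.
print(offsets(row_major, column="age"))  # [2, 5, 8]
print(offsets(col_major, column="age"))  # [6, 7, 8]

# Reading one row: contiguous in the row layout, scattered in the column layout.
print(offsets(row_major, row_id=2))      # [3, 4, 5]
print(offsets(col_major, row_id=2))      # [1, 4, 7]
```

Contiguous positions mean one sequential read; scattered positions mean either many seeks or reading data you then throw away.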
@NIVO1000 2 years ago
I see, so “column family” and “column storage” are two different things. I think that’s where I was getting confused.
@jordanhasnolife5163 2 years ago
@@NIVO1000 yup
@user-jt5nd3yq4u 10 months ago
@@jordanhasnolife5163 Didn't you start your Cassandra video by saying it is a wide-column database? I thought it was also a columnar database and would therefore have the same read performance as HBase.
@jordanhasnolife5163 10 months ago
@@user-jt5nd3yq4u Wide column is just a way of describing how the data is presented back to a user. On disk, it can actually be represented in a variety of different ways. Whereas Cassandra uses row-oriented storage on disk, HBase uses column-oriented storage.
@DipalPatel-ds4gn 8 months ago
TIFU by Pavloving myself into sleeping to your videos. 😂😂
@jordanhasnolife5163 8 months ago
I'm sorry in advance
@raj_kundalia 1 year ago
Thank you!
@utkarshgupta2909 2 years ago
Is it right to say: 1) HBase supports "distributed atomic transactions", as it gets strong consistency from the HDFS replication pipeline. 2) HBase does not support local transactions [Isolation], as the region node has a memtable without any locking?
@jordanhasnolife5163 2 years ago
1) I don't think so, as that would imply that you could do multiple different writes and have them all either go through or fail together. It seems that only the SSTables are replicated, so I'm not sure that memtable writes are even strongly consistent. 2) I think that, due to the write-ahead log, you should only be able to write one thing to the memtable at a time, but truthfully I'm not too sure.
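For point 2, here is a rough sketch of the idea that a write-ahead log can serialize memtable writes. This is an assumption-laden toy, not how HBase actually handles concurrency: `WalMemtable` is a made-up class, the log is a plain list rather than a durable file, and a single lock stands in for whatever the real system does.

```python
import threading


class WalMemtable:
    """Toy store: every write is appended to a log, then applied to the
    memtable, with both steps done under one lock so writes are serialized."""

    def __init__(self):
        self.log = []        # stand-in for a durable write-ahead log file
        self.memtable = {}
        self._lock = threading.Lock()

    def put(self, key, value):
        with self._lock:
            self.log.append((key, value))  # 1) record the intent first
            self.memtable[key] = value     # 2) then apply it in memory

    @staticmethod
    def replay(log):
        """Rebuild a memtable from the log, e.g. after a crash."""
        recovered = {}
        for key, value in log:
            recovered[key] = value
        return recovered


def writer(store, writer_id):
    for j in range(100):
        store.put("k", (writer_id, j))


store = WalMemtable()
threads = [threading.Thread(target=writer, args=(store, i)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because the log order and the apply order are the same, replaying the log reproduces the memtable exactly. That gives ordering for single writes only; it says nothing about grouping several writes into one transaction.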