
OpenStack: What is the optimal number of objects per container?

I had this question, so I went over to #openstack on Freenode. There, notmyname (apparently a moderator or op of some kind) answered:

12:20 < notmyname> Renich: I saw you asking questions about swift earlier
12:20 < notmyname> Renich: about object count per container
12:27 < notmyname> Renich: in case you see this later, here's my answer...
12:28 < notmyname> Renich: the recommended number of objects per container in swift depends on two things: (1) how many objects you want to add per second per container and (2) what sort of hardware you have referenced in the container rings
12:28 < notmyname> Renich: for the second, I strongly suggest using flash. (SSDs are fine)
12:29 < notmyname> Renich: if you need to sustain eg 100 writes per second in a single container and you've got flash devices for the container storage, then you'll probably be looking at a few dozen million objects
12:30 < notmyname> Renich: but note that this is (1) write performance--reads are unaffected and (2) per container--so shard your data client-side to use lots of containers
12:31 < Renich> notmyname: yes, thanks. Your answer helps a lot. And, yeah, we're using SSDs and testing a ZFS setup actually
12:32 < notmyname> Renich: almost all of any write performance penalty in large container has been eliminated in swift over the last year. but operationally, it's still a good idea to avoid truly huge containers. you don't want to try to replicate billion-row DBs
12:32 < notmyname> Renich: oh, interesting. I'd be interested to hear what kind of performance you get with that. last I saw (but it was a long time ago) ZFS had some pretty bad performance numbers when you get it reasonably full (lots of inodes)
12:34 < notmyname> Renich: also feel free to drop by #openstack-swift if you've got further questions
12:34 < Renich> notmyname: well, we're trying it out in a very specific use case. One putter, and a lot of getters on the setup. For climatic data
12:34 < Renich> notmyname: sure thing, thanks

So, in conclusion:

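  • Swift no longer suffers much on writes when a container holds many objects; reads are unaffected.
  • Still, keep each container to a few dozen million objects at most (the figure given for about 100 writes per second on flash), and shard your data client-side across many containers.
  • Use flash (SSDs are fine) for the devices in the container rings.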
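The advice to shard data client-side across many containers can be sketched roughly as below. This is only an illustration, not part of Swift or of any client library: the function names, the "climate" prefix, the 64-container count, and the 20-million-per-container target are all assumptions picked for the example.

```python
import hashlib


def containers_needed(total_objects: int, max_per_container: int) -> int:
    """Smallest number of containers that keeps each one under the target size."""
    # Ceiling division using integers only.
    return -(-total_objects // max_per_container)


def container_for(object_name: str, prefix: str = "climate",
                  num_containers: int = 64) -> str:
    """Pick a container for an object by hashing its name.

    The same name always maps to the same container, so readers can
    locate an object without any lookup table.
    """
    digest = hashlib.sha256(object_name.encode("utf-8")).hexdigest()
    shard = int(digest, 16) % num_containers
    return f"{prefix}-{shard:04d}"


if __name__ == "__main__":
    # Hypothetical sizing: one billion objects, at most 20 million per container.
    print(containers_needed(1_000_000_000, 20_000_000))
    # Hypothetical object name; the writer and all readers must use the same function.
    print(container_for("station-42/2015-06-01T00:00.nc"))
```

The writer and every reader have to agree on the prefix and the container count. Changing the count later remaps most object names to different containers, so it is worth choosing a count with some headroom up front.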