What dataset type should I pick?

Long-term datasets are what you would think of as "persistent": they stick around for as long as they are useful. They expire only after going unused for a long time, not at a fixed date, and an expired dataset is locked down rather than automatically destroyed. Long-term datasets require administrator approval, so you will need to justify the request. Expect an email from us asking about it.

Short-term datasets are for running a series of experiments over a short period (days to weeks) against the same large data set, when local storage is impractical because the data is too large to fit in a disk partition.

Image-backed datasets are the simplest kind of dataset: they are loaded onto your node into a spare partition or volume. An image-backed dataset is in fact an instance of an ephemeral blockstore that is preloaded with the data contained in the image file. This saves you the trouble of copying your data to the local disk, either from across the public internet or from one of the blockstore types above.
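
As a rough summary of the three descriptions above, the choice can be sketched as a small decision function. This is only an illustration and not part of any testbed API. The function name, its two yes/no parameters, and the order of the checks are assumptions made for this example. In particular, preferring an image-backed dataset whenever the data fits on the node's disk is one reading of "simplest", not a stated rule.

```python
def suggest_dataset_type(fits_on_node_disk: bool, needed_beyond_weeks: bool) -> str:
    """Suggest a dataset type from two rough yes/no answers (hypothetical helper)."""
    # Assumption: if the data fits in a spare partition or volume on the node,
    # the image-backed type is the simplest option.
    if fits_on_node_disk:
        return "image-backed"
    # Too large for local disk and needed for longer than days to weeks:
    # a long-term dataset, which requires administrator approval.
    if needed_beyond_weeks:
        return "long-term"
    # Too large for local disk, but only needed for days to weeks.
    return "short-term"


if __name__ == "__main__":
    print(suggest_dataset_type(fits_on_node_disk=False, needed_beyond_weeks=False))
```

Treat the result as a starting point only. A long-term dataset still has to be justified to the administrators.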

More about Datasets