Storage Roadmap

Open andrross opened this issue 2 years ago • 16 comments

This is a more specific plan expanding on the ideas introduced in the High Level Vision for Storage issue. The goal of this plan is to have a place to discuss the bigger picture as we design and implement the incremental features along the way.

Phase 1: Add Remote Storage Options for Improved Durability

Users can enable continuous backup of translog documents and Lucene segments to remote storage, guaranteeing durability without replicas or periodic snapshots. No changes to search.

Phase 2: Searchable Snapshots

Users can search snapshots in remote repositories without downloading all index data to disk ahead of time.

This phase implements the ability for an OpenSearch node to search indexes that are stored in remote storage as snapshots. It will leverage what Ultrawarm has built everywhere that is possible, but will be built natively into OpenSearch. It will fetch index data on-demand during searches and use a disk-based cache to improve performance. It will not assume that indexes are immutable (in preparation for Phase 4) or that shards have been merged to a single Lucene segment (in preparation for Phase 3).
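To make the on-demand fetching idea concrete, here is a minimal sketch, not the actual OpenSearch implementation. It assumes a remote file is read in fixed-size blocks and that a block is downloaded only the first time a search touches it. The names `OnDemandFile`, `fetch_range`, and `BLOCK_SIZE` are hypothetical, and an in-memory dict stands in for the disk-based cache.

```python
from typing import Callable, Dict

BLOCK_SIZE = 8  # tiny value for illustration only; a real cache would use much larger blocks


class OnDemandFile:
    """Serves byte ranges of a remote file, fetching only the blocks that are touched."""

    def __init__(self, fetch_range: Callable[[int, int], bytes], length: int):
        # fetch_range is a hypothetical callback: (offset, size) -> bytes from the repository
        self._fetch_range = fetch_range
        self._length = length
        self._blocks: Dict[int, bytes] = {}  # stand-in for the disk-based cache
        self.remote_fetches = 0

    def _block(self, index: int) -> bytes:
        # Download the block on first use, then serve it from the cache.
        if index not in self._blocks:
            start = index * BLOCK_SIZE
            size = min(BLOCK_SIZE, self._length - start)
            self._blocks[index] = self._fetch_range(start, size)
            self.remote_fetches += 1
        return self._blocks[index]

    def read(self, offset: int, size: int) -> bytes:
        end = min(offset + size, self._length)
        if offset >= end:
            return b""
        first, last = offset // BLOCK_SIZE, (end - 1) // BLOCK_SIZE
        data = b"".join(self._block(i) for i in range(first, last + 1))
        skip = offset - first * BLOCK_SIZE
        return data[skip:skip + (end - offset)]


# Usage: the "remote" file here is just a bytes object.
remote = bytes(range(40))
f = OnDemandFile(lambda off, n: remote[off:off + n], len(remote))
first_read = f.read(10, 4)    # touches only block 1
second_read = f.read(12, 10)  # reuses block 1, fetches block 2
```

Only the blocks overlapping a requested range are ever downloaded, which is what lets a search run without copying the whole snapshot to disk first.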

This is intended as an incremental feature that can add value on the way toward implementing the longer-term goals in Phases 3 and 4.

Phase 3: Searchable Remote Index

Users can search remote data from Phase 1 indexes without requiring all index data to be stored on disk.

This phase will allow users to remove index data from instance storage for remote storage-enabled indexes from Phase 1 while retaining the ability to search these indexes via the remote searching feature built in Phase 2. Until Phase 4 is implemented, these indexes will be read-only. The work required here is to implement the functionality to migrate an index from a Phase 1 remote-enabled index to a (read-only) searchable remote-only index, as well as to adapt the snapshot-based functionality from Phase 2 to work with these types of indexes.

Phase 4: Writable Searchable Remote Index

Users can write to searchable remote indexes created in Phase 3, without requiring index data to be stored on instance storage.

This phase requires extending the functionality built in Phase 1 to write to remote-only indexes as opposed to mirroring data on disk and in the remote store.

Phase 5: Cold Remote Index

Users can create indexes where both index data and metadata are stored in remote storage. Indexes are searchable and writable, possibly only via an asynchronous API due to the amount of time needed to process requests.

This phase requires designing and building a new concept of creating index readers and writers on-demand from metadata and data located in a remote store.

Additional Work Streams

The following are additional features that are related to or enabled by remote storage.

Segment Replication with Remote Storage

Segment replication is a feature under development to replicate Lucene segments from the primary node to replicas (as opposed to replicating the original document, which requires replicas to do the indexing as well). If the primary node is also replicating Lucene segments to remote storage (Phase 1), then replicas can pull that data from remote storage instead of copying from the primary. This architecture leverages the remote store for fanout during replication, eliminating a bottleneck on the primary when a large number of replicas exist.
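A rough sketch of the replica side of this idea, under stated assumptions: the remote store exposes a listing of segment file names with checksums, and a replica compares that listing with its local files to decide what to download. The function names and the name-to-checksum dict shape are hypothetical and are not taken from the OpenSearch code.

```python
from typing import Dict, List


def files_to_pull(remote_segments: Dict[str, str], local_segments: Dict[str, str]) -> List[str]:
    """Return segment files that are missing locally or whose checksum differs.

    Both arguments map file name -> checksum (a hypothetical listing format).
    """
    return sorted(
        name
        for name, checksum in remote_segments.items()
        if local_segments.get(name) != checksum
    )


def files_to_delete(remote_segments: Dict[str, str], local_segments: Dict[str, str]) -> List[str]:
    """Return local files that no longer appear in the remote listing."""
    return sorted(name for name in local_segments if name not in remote_segments)


# Usage: the replica already has _0.cfs and an outdated commit point.
remote = {"_0.cfs": "aa", "_1.cfs": "bb", "segments_3": "cc"}
local = {"_0.cfs": "aa", "segments_2": "zz"}
pull = files_to_pull(remote, local)
stale = files_to_delete(remote, local)
```

Because every replica computes its own diff against the remote store, the primary uploads each segment once regardless of how many replicas exist.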

Point in Time Restore

Phase 1 enables continuous backup to remote storage, which means it is in theory possible to implement a feature to restore to nearly any point in time that was backed up along the way.
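As a purely illustrative sketch of what such a restore would need to decide, assume each continuous backup leaves behind a timestamped checkpoint (an assumption, since no design exists yet). Restoring to a requested time then means picking the latest checkpoint at or before it. `checkpoint_for` is a hypothetical helper.

```python
from bisect import bisect_right
from typing import List, Optional


def checkpoint_for(timestamps: List[float], target: float) -> Optional[float]:
    """Return the latest checkpoint timestamp <= target, or None if none exists.

    timestamps must be sorted in ascending order.
    """
    i = bisect_right(timestamps, target)
    return timestamps[i - 1] if i else None


# Usage: checkpoints were uploaded at t=100, 200, and 300.
checkpoints = [100.0, 200.0, 300.0]
chosen = checkpoint_for(checkpoints, 250.0)
```

The "nearly" in "nearly any point in time" corresponds to the gap between checkpoints: a restore can only land on a state that was actually uploaded.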

andrross avatar Jun 29 '22 16:06 andrross

I'm so happy to see this roadmap. I have been looking for an alternative for searchable snapshots recently. 💯

zehonghuang avatar Jun 30 '22 15:06 zehonghuang

Hi, what does "Ultrawarm has built everywhere" mean? Will searchable snapshots only run on AWS?

zehonghuang avatar Jul 01 '22 04:07 zehonghuang

Hi, what does "Ultrawarm has built everywhere" mean?

Much of the implementation of Ultrawarm will be applicable here, so that code will be ported over where appropriate.

Will searchable snapshot only run on AWS?

Definitely not. We intend to build this on top of the existing repository interface, so any object store for which there is a repository implementation will be supported (AWS, Azure, GCP, HDFS, etc).

andrross avatar Jul 01 '22 19:07 andrross

Reducing the cost of storage is very beneficial. I will keep following this feature. Thank you for replying to my question.

zehonghuang avatar Jul 02 '22 07:07 zehonghuang

Quick update on the progress here:

Phase 1 has been released as an experimental feature in OpenSearch 2.3! We welcome any and all feedback as we work towards finalizing this new capability.

Phase 2 is in active development and we're targeting OpenSearch 2.4 to have the first experimental version of this feature.

andrross avatar Sep 20 '22 00:09 andrross

Hi, since the primary shard may write index data and translog to the remote store: if there is a split brain, the old primary may still accept bulk requests from clients, so files in remote storage could be written by multiple primaries. Can this situation happen? If so, is there any proposal to solve it?

jrj0823 avatar Jun 06 '23 09:06 jrj0823

hello, are there any plans to support searchable snapshots in ISM?

bugmakerrrrrr avatar Jun 06 '23 09:06 bugmakerrrrrr

Hi, since the primary shard may write index data and translog to the remote store: if there is a split brain, the old primary may still accept bulk requests from clients, so files in remote storage could be written by multiple primaries. Can this situation happen? If so, is there any proposal to solve it?

This wouldn't happen today, since we ensure we talk to the replica copy to detect network partitions and isolated writers before writing to the remote store. You can read more on this in #3706.
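A heavily simplified sketch of that kind of check, for illustration only (the actual mechanism is described in #3706 and may differ). It assumes the primary asks each replica which primary term it currently knows and refuses to upload if a replica is unreachable or reports a newer term. `may_write_to_remote` and the response format are hypothetical.

```python
from typing import List, Optional


def may_write_to_remote(primary_term: int, replica_responses: List[Optional[int]]) -> bool:
    """Decide whether a primary should upload to the remote store.

    replica_responses holds one entry per replica: the primary term that replica
    knows about, or None if the replica could not be reached (hypothetical format).
    """
    for known_term in replica_responses:
        if known_term is None:
            # Unreachable replica: this primary may be on the wrong side of a partition.
            return False
        if known_term > primary_term:
            # A newer primary has been elected: this node is a stale writer.
            return False
    return True


# Usage: a primary on term 3 checks with two replicas.
healthy = may_write_to_remote(3, [3, 3])
partitioned = may_write_to_remote(3, [3, None])
stale = may_write_to_remote(3, [4, 3])
```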

Bukhtawar avatar Jun 06 '23 14:06 Bukhtawar

hello, are there any plans to support searchable snapshots in ISM?

Hey @bugmakerrrrrr. Thanks for your interest in searchable snapshots. I have filed a feature request with the index management team here: https://github.com/opensearch-project/index-management/issues/808 Please track that issue for updates, and feel free to add any other requirements there.

kotwanikunal avatar Jun 06 '23 18:06 kotwanikunal

I'm so excited and happy about this Storage Roadmap 👍 .

I read the Searchable Snapshot documentation section, and there is one thing that is not clear to me:

Is it possible for the search nodes to exceed the available disk space with data fetched from remote storage for the snapshot indexes? What is the behavior in this case? Will the cache be rotated based on on-demand queries?

carlos-neto-trustly avatar Jun 27 '23 20:06 carlos-neto-trustly

Looks like this is where the tech discussion is taking place, so I am curious to know if there's any open source material (not code, just information) that we can read on how Ultrawarm and Cold tier work. I just want to understand how they work to better size clusters. For example, are ultrawarm searches distributed across ultrawarm nodes? Will adding more nodes speed up the cold -> ultrawarm migration and ultrawarm searches? Please direct me to a more suitable place to ask this question if there is one, thanks!

marsupialtail avatar Oct 24 '23 03:10 marsupialtail

@marsupialtail UltraWarm and Cold Tier are current AWS offerings, so AWS is the place to go for information about those features. This issue and the overall plan are about the features being built on top of the new remote store-based architecture.

andrross avatar Nov 09 '23 22:11 andrross

I'm so excited and happy about this Storage Roadmap 👍 .

I read the Searchable Snapshot documentation section, and there is one thing that is not clear to me:

Is it possible for the search nodes to exceed the available disk space with data fetched from remote storage for the snapshot indexes? What is the behavior in this case? Will the cache be rotated based on on-demand queries?

Is there any new information on this?

yusizn avatar Dec 14 '23 08:12 yusizn

I'm so excited and happy about this Storage Roadmap 👍 . I read the Searchable Snapshot documentation section, and there is one thing that is not clear to me: Is it possible for the search nodes to exceed the available disk space with data fetched from remote storage for the snapshot indexes? What is the behavior in this case? Will the cache be rotated based on on-demand queries?

Is there any new information on this?

Yes, data is offloaded from local disks in that case, with the least recently used data being evicted first.
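To illustrate the eviction order, here is a minimal sketch of a size-bounded least-recently-used cache. It is not the OpenSearch file cache itself; `LruFileCache` and its methods are hypothetical names, and it tracks only entry sizes rather than real files.

```python
from collections import OrderedDict
from typing import List


class LruFileCache:
    """Tracks cached entries by size and evicts the least recently used first."""

    def __init__(self, capacity_bytes: int):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self._entries: "OrderedDict[str, int]" = OrderedDict()  # key -> size, oldest first

    def touch(self, key: str) -> bool:
        """Mark an entry as recently used; return True if it is cached."""
        if key in self._entries:
            self._entries.move_to_end(key)
            return True
        return False

    def add(self, key: str, size: int) -> List[str]:
        """Add an entry, evicting least recently used entries until it fits.

        Returns the evicted keys. In this sketch an entry larger than the whole
        cache is still admitted after everything else has been evicted.
        """
        evicted = []
        if key in self._entries:
            self.used_bytes -= self._entries.pop(key)
        while self._entries and self.used_bytes + size > self.capacity_bytes:
            old_key, old_size = self._entries.popitem(last=False)
            self.used_bytes -= old_size
            evicted.append(old_key)
        self._entries[key] = size
        self.used_bytes += size
        return evicted


# Usage: a 100-byte cache receives three 40-byte entries.
cache = LruFileCache(100)
cache.add("a", 40)
cache.add("b", 40)
cache.touch("a")                 # "a" is now more recently used than "b"
evicted = cache.add("c", 40)     # does not fit, so the oldest entry is evicted
```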

andrross avatar Dec 15 '23 04:12 andrross

Hello, I'm very interested in this roadmap. Is there a release schedule for Phases 3, 4, and 5?

10000-ki avatar Feb 13 '24 08:02 10000-ki

@10000-ki We are working through the planning and estimating now and will keep this issue up to date when we have better visibility into the schedule.

andrross avatar Feb 15 '24 22:02 andrross