
Make code more discoverable by moving component crates to the top level

Open robertbastian opened this issue 1 year ago • 4 comments

I've been browsing a lot of Rust crates over the past few days for a personal project, and have found it super valuable if a crate is easy to locate when following the "Repository" link from docs.rs. This is useful for looking at the implementation, checking out dependencies, and for locating examples. Our crates are currently pretty hard to find, for example for icu_list you have to know it's in components/list. My proposal is:

  • Move all components out of components into the top level, into directories that match the crate names. This will make icu be sorted first, followed by all icu_<component> crates
  • Move experimental/components to the top level as icu_experimental
  • Move the remaining crates in experimental into either ffi or utils. The experimental status is not very useful for utils imho, most crates in utils are pre-1.0 and don't fully conform to our standards
  • Potentially move client-facing provider crates (_blob, _fs, _adapters, _datagen) to the top level as well
  • Maybe move our tutorials from docs/tutorials into a top-level examples folder
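
The first bullet can be sketched in a few lines of Rust. This is only an illustration under stated assumptions: the function names are hypothetical, and the `components/<suffix>` rule is generalized from the single `icu_list` example above, so it applies to component crates only (not, for instance, to crates under `provider`).

```rust
/// Current layout for component crates: `icu_list` lives in `components/list`.
/// Assumption: all component crates follow this rule, and the metacrate `icu`
/// lives in `components/icu`.
fn current_dir(crate_name: &str) -> String {
    let suffix = crate_name.strip_prefix("icu_").unwrap_or(crate_name);
    format!("components/{}", suffix)
}

/// Proposed layout: the top-level directory is simply the crate name.
fn proposed_dir(crate_name: &str) -> String {
    crate_name.to_string()
}

/// Directory listing of the crates under the proposal, in lexicographic order.
/// `icu` sorts before every `icu_<component>` because it is a prefix of them.
fn proposed_listing(crate_names: &[&str]) -> Vec<String> {
    let mut dirs: Vec<String> = crate_names.iter().map(|n| proposed_dir(n)).collect();
    dirs.sort();
    dirs
}
```

With the proposed layout, someone arriving from docs.rs would not need to know the crate-name-to-directory rule at all, since the two names are identical.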

robertbastian avatar Jan 31 '24 22:01 robertbastian

  • @sffc - Dissolving the top-level experimental folder is positive and doable. However, I find it useful for all crates to be exactly one level down from the top level. We would have three broad categories of crates, which I think is fine.
  • @robertbastian - People look inside the icu crate
  • @Manishearth - What we're doing is a fairly common pattern
  • @Manishearth - The provider/components split is a bit confusing and I would be fine merging those.
  • @zbraniecki - My user story is I search for a directory
  • @Manishearth - Or they look in the workspace file
  • @zbraniecki - I think "components" is a clear name. It is in the ICU acronym. There's no better word; "modules" is not better. I still think "components" is the most descriptive word for international components for unicode.
  • @zbraniecki - I'll also push back on the idea that users will always use the metacrate. I see the metacrate as an entry crate, but once people know what they are doing they should use the more modular crates. That is part of the ICU4X value proposition. So I would rather maintain "ICU4X is a collection of 30 crates that have a convenient metacrate" as opposed to "ICU4X is a crate with 30 modules which happen to be published as individual crates"
  • @robertbastian - The "icu" crate is not a component, and "icu_experimental" is not a component. icu_datagen is not an icu_provider_foo crate, so why is it in provider? Where should they all live?
  • @sffc - We shouldn't entirely dismiss the disruption of making wholesale file moves of our engineering artifacts, both in the short term (merge conflicts, vendoring) and long term (repo history).
  • @Manishearth - This is a case where I would consider the icu4x developer experience more important.
  • @robertbastian - Where does icu_experimental live?
  • @Manishearth - Maybe we could move the metacrates to the top level, although I think it could be confusing if we did put the metacrate at the top level.
  • @zbraniecki - (1) For examples, I think this is a philosophical debate; I don't think they should move into the metacrate. I think a developer should come and download icu_locid and be able to run its example. That's unique to ICU4X; they cannot do that with ICU4C. (2) I think this is a good conversation to have. I've never been completely comfortable with all of the components. They capture customer-facing components, like LDML. Components are datetime, plurals, maybe properties. But things that are not customer-facing for me are not components. Those are "internals". I would be more strict with what goes into components. For example, maybe we have "meta", like "meta/experimental" and "meta/icu". With "provider", it's not clear what it is the provider for based only on the directory name. But ICU4X is a data-heavy project so it makes sense that it could be the data provider. We could also consider renaming the directory to "data", which I acknowledge could be confused with data artifacts.
  • @robertbastian - There are some provider crates that are user-facing.
  • @zbraniecki - But the provider crates are a means to an end.
  • @robertbastian - For me it's more important to read the example more than download and run the example. So I want examples to be super discoverable.
  • @sffc - For user-facing examples, we have a web site with tutorials.
  • @robertbastian - I would rather go to a crate, with a predictable file structure, than some weird web site. Also, we don't have a website
  • @zbraniecki - We have docs tests and tutorials as the most introductory level. And examples are diving in deeper into things like memory use.
  • @robertbastian - It would be nice to have examples that show how components interact. Some crates have 30 examples.
  • @zbraniecki - That type of suggestion makes sense to me to go into the metacrate.

Q1: where should icu_experimental live:

  • icu_experimental (@robertbastian)
  • meta/experimental (@zbraniecki)
  • components/experimental (@manishearth)
  • @sffc - It is a crate with modules. So I think components/experimental is my first choice. My second choice would be top level icu_experimental. I don't like a top-level meta.
  • @Manishearth - I see it as a component. So components/experimental, then meta/experimental. Top level distant third.
  • @robertbastian - Both icu and icu_experimental are meta crates, they contain multiple components. If you need an ICU component, it will be either in icu or icu_experimental, so those are our most important crates. They should be at the top level. Also when we move a component it moves from icu_experimental to icu.
  • @zbraniecki - I think I preferred the separate experimental crates, but I understand the technical reasons for merging them. In terms of where it lives, I see icu_experimental as a metacrate for which we don't publish separate crates.
  • @younies - For experimental components that are further along in development, can we move them into their own crate at that point?
  • @zbraniecki - I see an argument that something could move to its own component; can we keep it repo-only without publishing it on crates.io?
  • @robertbastian - The problem is icu_datagen depends on it, so it has to be released. We tried to do dev-dependencies only, but that didn't work.
  • @zbraniecki - For where it lives, I think in a directory called meta, as meta/experimental.

Q2: where should icu live:

  • icu (@robertbastian)
  • meta/icu (@zbraniecki)
  • components/icu (@manishearth)
  • @sffc - meta is a confusing/ambiguous term.
  • @Manishearth - icu at the top level sounds like a place where components might live, which is confusing.
  • @zbraniecki - We could name the directory meta and add a README.md in that directory. We can do the same in the provider directory.
  • @sffc - icu_experimental is going to be a very busy crate with a lot of commits.

Q3: where should icu_datagen live:

  • provider/datagen
  • icu_datagen

Q3.5: where should icu_provider_macros live:

  • provider/macros
  • inside the icu_provider crate, like other macros that are exposed through other crates

Q4: where should icu_provider[_blob|_fs|] live

  • provider/[blob|fs|core]
  • components/[provider|provider_fs|provider_blob]
  • icu_provider[_blob|_fs|]

Q5: where should the baked data crates live

  • provider/baked

robertbastian avatar Feb 08 '24 17:02 robertbastian

In general, my position is to be conservative: don't change a location unless the new location is clearly more logical than the old one.

I think there is a decent case to relocate the following crates:

  1. icu_experimental because it is so new and we didn't really discuss it yet. OK with:
    • /components/experimental
    • /experimental/components
    • /experimental
    • /icu_experimental
  2. The metacrate, because it is small and there's a reasonable argument that it is not a component. OK with:
    • /components/icu (status quo is fine by me)
    • /icu
    • /metacrate
    • /metacrate/icu
  3. icu_provider_macros because of consistency with other proc macro crates. OK with:
    • /provider/macros (status quo)
    • /provider/core/macros

As far as the other crates inside /provider:

  • I'm not convinced that moving them to the top level or to /components is an improvement. As @zbraniecki pointed out, these are mostly things developers need after they start using ICU4X, and they don't contribute functionality on their own.
  • I would be open to renaming the /provider directory if there were a much better name, but I haven't heard one. /data has the problem that it looks like it contains static data artifacts, which is not the case here; I recall early on when we named the directory specifically choosing /provider instead of /data for this reason.

sffc avatar Feb 09 '24 02:02 sffc

I roughly agree with Shane's position above, except that I don't like icu at the top level since it might be confused with components.

Manishearth avatar Feb 09 '24 18:02 Manishearth

components/experimental and components/meta?

robertbastian avatar Feb 12 '24 11:02 robertbastian

Do we have consensus on

  • experimental/components -> components/experimental
  • experimental/{harfbuzz,ecma402} -> ffi/
  • experimental/{bies,ixdtf,zerotrie} -> utils/
  • provider/macros -> provider/core/macros
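
The four moves above can be sketched as a path-rewriting function. This is a minimal sketch, not project tooling: `relocate` and `strip_dir` are hypothetical names, and it assumes each crate keeps its directory name when it moves under ffi/ or utils/.

```rust
/// Rewrites a repository-relative path according to the moves listed above.
/// Returns `None` for paths that none of the listed moves cover.
fn relocate(path: &str) -> Option<String> {
    // provider/macros -> provider/core/macros
    if let Some(rest) = strip_dir(path, "provider/macros") {
        return Some(format!("provider/core/macros{}", rest));
    }
    // experimental/components -> components/experimental
    if let Some(rest) = strip_dir(path, "experimental/components") {
        return Some(format!("components/experimental{}", rest));
    }
    // Remaining experimental crates go to ffi/ or utils/ by crate directory.
    let rest = path.strip_prefix("experimental/")?;
    let name = rest.split('/').next().unwrap_or(rest);
    match name {
        "harfbuzz" | "ecma402" => Some(format!("ffi/{}", rest)),
        "bies" | "ixdtf" | "zerotrie" => Some(format!("utils/{}", rest)),
        _ => None,
    }
}

/// Strips `dir` from the front of `path` only at a path-component boundary,
/// so that a sibling such as `provider/macros_extra` does not match.
fn strip_dir<'a>(path: &'a str, dir: &str) -> Option<&'a str> {
    let rest = path.strip_prefix(dir)?;
    if rest.is_empty() || rest.starts_with('/') {
        Some(rest)
    } else {
        None
    }
}
```

Paths outside the four listed moves, such as anything already under components/, are left alone and reported as `None`.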

robertbastian avatar Feb 28 '24 14:02 robertbastian

I am okay with that route.

Manishearth avatar Feb 28 '24 16:02 Manishearth

I don't have an objection to https://github.com/unicode-org/icu4x/issues/4569#issuecomment-1969048536

sffc avatar Feb 29 '24 06:02 sffc

Discuss with:

  • @sffc
  • @robertbastian
  • @Manishearth
  • @zbraniecki

sffc avatar Feb 29 '24 18:02 sffc

  • @robertbastian - docs is confusing, because it doesn't just stand for documents, it also stands for documentation, which users might be looking for
  • @sffc - I want the documents folder to contain only .md files, so moving tutorials to the top level is good
  • @manishearth - At the same time, there will be people looking for our minutes, etc., and having those be discoverable is important, just not under the name docs, which is confusing

Conclusion:

  • Move tutorials to top level
  • Rename docs to documents or something similar
  • Consider cleaning up the folder a bit

LGTM: @manishearth @sffc @robertbastian

Manishearth avatar Mar 15 '24 14:03 Manishearth