mime icon indicating copy to clipboard operation
mime copied to clipboard

1.0 Release Discussion / Wishlist

Open abonander opened this issue 8 years ago • 15 comments

With Hyper on its way to 1.0, perhaps it's time to start discussing where this crate is at in relation to its own eventual 1.0 release, as well as create a place to document potential blockers for said release. That said, I haven't looked too closely at this crate lately so I don't know what any potential blockers might be.

Thoughts, @seanmonstar?

abonander avatar Jun 10 '17 01:06 abonander

I just did a large redesign of this crate, resulting in v0.3. I realize reqwest is on the road to 1.0, and it publicly relies on mime, so there does need to be a way forward. I feel that the new design 0.3 is one that can be stabilized, with one possible caveat. I'm uncertain if I'd rather all the mime constants to be on the Mime type, or on the crate.

Should you do ContentType::new(mime::TEXT_CSV), or ContentType::new(Mime::TEXT_CSV). If the latter, then since associated constants won't stabilize until Rust 1.20, might have to wait till then.

Besides that, I think there just needs some usage out of 0.3, which should happen much more so when hyper releases 0.11, which contains the upgrade.

seanmonstar avatar Jun 10 '17 18:06 seanmonstar

From a high-level perspective, I think I would prefer to see the constants contained within a submodule, perhaps namespaced by top-level type, e.g.:

/// `application/_` mime types
pub mod application {
    pub const NAME = Name { "application" }; // maybe?
    pub const WILDCARD = Mime { "application/*" }; // or STAR
    pub const OCTET_STREAM = Mime { "application/octet_stream" };
}

// ..

/// `multipart/_` mime types
pub mod multipart {
    pub const NAME = Name { "multipart" };
    pub const FORM_DATA = Mime { "multipart/form-data" };
    pub const ATTR_BOUNDARY = Name { "boundary" };
}
// ...

This would help avoid polluting the crate root namespace as more constants are added.

I think as far as desired APIs go, I do miss the ability to construct a Mime in a strongly typed manner, e.g. Mime(TopLevel::Multipart, SubLevel::FormData, vec![Attr::Boundary("boundary".to_string())]. Maybe add a method like Mime::add_param() which returns a new Mime with the given parameter and value appended? It can be done manually, of course, but having it as a method could avoid completely reparsing the MIME type and reduce potential mistakes as well.

A specialization of ToString which skips the formatting framework would be neat as well, but that can be added backwards-compatibly when specialization stabilizes.

abonander avatar Jun 10 '17 19:06 abonander

  • Using modules to group by top level media type is an interesting idea! Seeing mime::text::CSS and things is neat.

    There could be some confusing around which constant is a Mime and which is a Name. For instance, would this confuse a user: mime::JSON vs mime::application::JSON? Is that a situation where it'd be better to be Name::JSON?

    I'm not sure whether the current situation really "pollutes" the crate root. Using a flat design versus modules saves 1 character of typing per constant mime::TEXT_PLAIN vs mime::text::PLAIN, not much to bother considering. Maybe if you were looking in the docs, you'd want to look for constants that you can use with only application types, or something?

  • Constructing a Mime without strings: I delayed this to get 0.3 finished sooner. Designing methods to handle both optional suffixes and parameters might warrant something like a builder... I imagine with macros 2.0, a mime! macro can be used to both construct a Mime by combining Names, and also by parsing strings at compile time. Examples of this macro:

    • let form_data = mime!(MULTIPART / FORM_DATA; BOUNDARY = my_rando_str);
    • const GITHUB: Mime = mime!(APPLICATION / "vnd.github.api.v3" + JSON);
    • const INVENTED: Mime = mime!("sean/monstar; foo=bar");
  • Mutating a Mime: I also skipped this to get 0.3 released. The actual mutating isn't too difficult, update the inner source and any stored indices. However, like setting the boundary parameter with a string, it would need to be a fallible method. Not all characters are valid in a Mime. So either fn add_param() -> Result<(), ParseError>, or maybe adding a FromStr for Name, and then you call mime.add_param(BOUNDARY, my_rando_str.parse()?);.

seanmonstar avatar Jun 11 '17 17:06 seanmonstar

I'm not sure whether the current situation really "pollutes" the crate root.

Currently? Not really, I guess. If you plan to add a large number of constants more, maybe. Are you going to keep it about the same? Just constants for the most used types?

Maybe then, just a few submodules would do, though it would make the absolute paths longer:

pub mod attrs {
    pub const BOUNDARY = Name { "boundary" };
    pub const CHARSET = Name { "charset" };
}

// Is having `Name` constants strictly necessary since they can be converted to and from strings at nearly zero cost?
pub mod names {
    pub const APPLICATION = Name { "application" };
    // ...
    pub const MULTIPART = Name { "multipart" };
    // ...
    pub const GIF = Name { "gif" };
}

pub mod types {
    pub const APPLICATION_JSON = Mime { "application/json" };
    pub const APPLICATION_OCTET_STREAM = Mime { "application/octet-stream" };
    // ...
    pub const IMAGE_WILDCARD = Mime { "image/*" };
    pub const IMAGE_GIF = Mime { "image/gif" };
    // ...
    pub const MULTIPART_FORM_DATA = Mime { "multipart/form-data" };
}

You could use proc-macro-hack to get a compile-time macro now, however I don't see much point in combining Names since their underlying strings will have to be concatted anyway. It's slightly safer but I still miss the old API design; I understand why it was difficult to work with and extend though.

abonander avatar Jun 11 '17 22:06 abonander

If you plan to add a large number of constants more, maybe. Are you going to keep it about the same? Just constants for the most used types?

Undecided. I'm sure there's a few other fairly common that I have forgotten. I'm not sure I'd want to just copy the IANA registry into this crate. With a macro to make new static Mimes, that feels unnecessary.

Is having Name constants strictly necessary since they can be converted to and from strings at nearly zero cost?

Well, having constants instead of people typing strings does allow the catching of typos. mime.type_() == mime::APPLICATON will error, mime.type_() == "applicaton" will not. As to them being of type Name instead of &str, not all names are treated the same. Specifically, most names are case insensitive, so mime.type_() == "TEXT" and mime.get_param("charset").unwrap() == "UTF-8" should work.

You could use proc-macro-hack to get a compile-time macro now

I did start a macro trying to do just that, but hit 2 problems, 1 probably solvable, another not.

  1. The hack defines an item, enum ProcMacroHack { ... }, and so you cannot use it multiple times in the same scope. I would consider trying to find a solution, except for number 2.
  2. To allow the macro to construct a Mime outside of the mime crate, the fields would need to be public. That would mean libraries can no longer rely on the a constructed Mime to only have valid characters. Macros 2.0 already are executed in scope of the defining crate, so they can access private fields, without having to expose them to the world.

I don't see much point in combining Names since their underlying strings will have to be concatted anyway.

With a compile time macro, they can be concatenated into a single static string, so the there's no runtime cost. And by allowing things like mime!(APPLICATION / CSV), you can again get typo checks instead of using a string.

seanmonstar avatar Jun 12 '17 16:06 seanmonstar

There was some discussion in #55 about splitting Mime into two types, MediaType and MediaRange, such that text/* is a valid MediaRange, but not a valid MediaType. I like the concept. Would be a breaking change...

seanmonstar avatar Aug 23 '17 18:08 seanmonstar

Makes sense.

abonander avatar Aug 23 '17 18:08 abonander

You might also want to consider adding some support for RFC2231. Currently you can parse/uses MIME conforming with the RFC, but you have to put extra work on top for using such mimes.

A example from the RFC would be:

Content-Type: application/x-stuff;
    title*0*=us-ascii'en'This%20is%20even%20more%20;
    title*1*=%2A%2A%2Afun%2A%2A%2A%20;
    title*2="isn't it!"

By adding a * to the end of you parameter name (e.g. filename => filename*) you switch to a slightly different format for the values, which:

  1. add a charset field followed by a language field separated by ' from the rest
  2. instead of quoting you now use percent encoding to encode from the charset specified into us-ascii valid for an parameter value

Additionally they add a mechanism to split long parameter values into multiple sections by adding a *N (e.g. *0,*1,...) to the end of the parameter name, but before any possible added * for encoding. Note that only the first section contains a charset/language field, following sections do not. Also following sections do not need to use encoding, through if encoding is used in any section the first section has to be encoded.

Like I said before this crate does parse such definitions just fine, but it would be interesting to have e.g. a get_rfc2231_param, method which iterates over all sections of a given parameter and preferable does so in the section order (which can differ from the order they appear in the MIME field). (e.g. a ~ Iter<Item=(usize, bool, Name)> where the bool indicates, if a section is encoded i.e. the full parameter name ended in a * and usize if the number of the section, starting with 0)

Similar nice would be a way to construct such MIMES, through this would mean pulling in percent_encoding dependencies.

For now I will implement some of the part on top of Mime for my use case. I can post them here when done or, if they don't contain any application specific logic, make a pull request.

rustonaut avatar Sep 19 '17 10:09 rustonaut

One question that's come up during the crate evaluation so far is whether mime is the best name for the crate. Since we're just dealing with media types then maybe media-types is more descriptive?

Splitting into MediaType and MediaRange seems like a good idea. I'm also thinking that if we did that then maybe it also makes sense to move the constants on to those respective types? That one seems like it's 6-of-one or half-a-dozen-of-the-other to me though.

KodrAus avatar Oct 23 '17 01:10 KodrAus

Here are some other thinks which I think are a good idea:

  1. do not use Name for both names and values. All "names" are can be compared case insensitive but only value of the charset param is compared case insensitive.
  2. do not provide as_str for the "new (1.)" Value type as quoted-pairs (e.g. \") can not be handled correctly with it instead provide as_str_repr(&self) -> &str (for the underlying representation) and to_content(&self) -> Cow<'a, str> for the actual content (comparasions can be implemented content based without needing any allocation even if quoted-pairs are used.
  3. maybe drop special handling of the charset parameter. While charsets are case insensitive, this is a detail on a different abstraction level and case insensitive comparison is not enough at all. With other words if someone want's to compare the person has to use an additional API anyway which knows about alias names for charsets (e.g. utf8 == utf-8). Any such API would implement case insensitive comparison by themselves, so no need to implement it here.
  4. consider if it is possible to parse a "text/plain" into the same representation as TEXT_PLAIN, by e.g. using a trie data struct containing the most common type/subtypes (speedier comparisons matching etc. wrt. to parsed mime-types and mime-ranges, similar parsing speed (I guess) but higher binary size)

rustonaut avatar Nov 01 '17 17:11 rustonaut

Would you consider cutting a new pre-1.0 release, so that I can use the serde feature in my create.

richard-uk1 avatar Jan 20 '20 14:01 richard-uk1

Any status on this a year later? Serde support still isn't available.

novacrazy avatar Feb 23 '21 05:02 novacrazy

It'd be nice to have serde support by now.

orhun avatar Aug 09 '21 10:08 orhun

The ability for applications/consumers to define const Mime instances would be super valuable.

All the APIs used to define the ones shipped in this crate are private. If someone needs to use a single mime-type but it's not shipped here, there doesn't seem to be any obvious workaround except "create a new instance each time you need a reference to one", which completely negates all the performance optimizations.

WhyNotHugo avatar Mar 22 '23 21:03 WhyNotHugo