phenopacket-schema icon indicating copy to clipboard operation
phenopacket-schema copied to clipboard

Update protoc generation pipeline

Open iimpulse opened this issue 1 year ago • 12 comments

I write this with a heavy heart knowing that this change will (most likely) be massive. Currently we are generating the protocol buffers with the maven plugin: protobuf-maven-plugin.

It appears that this plugin is no longer an active repository (~2 years). While I can appreciate the simplicity to use this type of generation it might be in our best interest to move away from this.

I propose that we configure our own pipelines for generating the different language artifacts and publishing them to the respective language repositories to support each language that protoc supports.

I am happy to take on this work but the progress will be slow.

iimpulse avatar Apr 10 '23 19:04 iimpulse

But the plugin appears to be working? Is something broken? The maven code basically just needs to call the google library so maybe there is not much need to update the maven plugin?

pnrobinson avatar Apr 10 '23 19:04 pnrobinson

What if we want to expand this to other languages? I have 2 such requests already (typescript, rust). Do we fork and extend the library or do we just make this repository call these google libs directly?

iimpulse avatar Apr 10 '23 20:04 iimpulse

Are you referring to making a new maven plugin that will call the google libs or a replacement for the google libs? Not sure I understand?

pnrobinson avatar Apr 10 '23 20:04 pnrobinson

We could fork the existing protobuf-maven-plugin and add support for the 2 languages or we can implement a shell like command for maven that will call protoc to generate bindings for the languages we want to support (all that protoc support ideally).

iimpulse avatar Apr 10 '23 23:04 iimpulse

protoc doesn't support Rust or TS (see supported languages), so replacing the maven plugin won't help us with this. protoc Rust is apparently being worked on so this will need an update to the plugin whenever that is ready. Alternatively there are about three different Rust protobuf compilers out there which could be used but would require a separate build script or cargo project.

julesjacobsen avatar Apr 13 '23 11:04 julesjacobsen

I've made a quick Rust implementation of how this could be done - 7a7381b it's possibly not perfect, but it might be good enough?

julesjacobsen avatar Apr 14 '23 16:04 julesjacobsen

Changes are in the rust-build branch

The basic functionality is there but there needs to be a bit more work to check it's ready for publishing on crates.io e.g. https://doc.rust-lang.org/cargo/reference/publishing.html

To test this you will need to create a new project e.g. cargo new phenopackets-rs add these dependencies to Cargo.toml

[dependencies]
phenopacket-schema = { git = "https://github.com/phenopackets/phenopacket-schema", branch = "rust-build", version = "2.0.2" }
pbjson-types = "0.5.1"
serde_json = "1.0.96"
serde_yaml = "0.9.21"

then in main.rs

use pbjson_types::Timestamp;
use phenopacket_schema::org::phenopackets::schema::v2::core::{Individual, PhenotypicFeature, Sex, TimeElement};
use phenopacket_schema::org::phenopackets::schema::v2::core::OntologyClass;
use phenopacket_schema::org::phenopackets::schema::v2::core::time_element::Element;
use phenopacket_schema::org::phenopackets::schema::v2::Phenopacket;

fn main() {
    let ontology_class = OntologyClass { id: "HP:0000001".to_string(), label: "Phenotypic abnormality".to_string() };
    let mut phenotypic_feature = PhenotypicFeature::default();
    phenotypic_feature.r#type = Some(ontology_class);

    let mut phenotypic_features = vec!(phenotypic_feature);

    let mut individual = Individual::default();
    individual.id = "subject1".to_string();
    individual.sex = Sex::Male.into();
    individual.time_at_last_encounter = Some(TimeElement { element: Some(Element::Timestamp(Timestamp { seconds: 1681483297, nanos: 0 })) });

    let mut phenopacket = Phenopacket::default();
    phenopacket.id = "example1".to_string();
    phenopacket.subject = Some(individual);
    phenopacket.phenotypic_features.append(&mut phenotypic_features);

    let j = serde_json::to_string_pretty(&phenopacket).unwrap();
    println!("{}", j);

    let deserde = serde_json::from_str(&j).unwrap();
    assert_eq!(phenopacket, deserde);

    let yaml = serde_yaml::to_string(&phenopacket).unwrap();
    println!("{}", yaml);
}

and you should see the output:

{
  "id": "example1",
  "subject": {
    "id": "subject1",
    "timeAtLastEncounter": {
      "timestamp": "2023-04-14T14:41:37+00:00"
    },
    "sex": "MALE"
  },
  "phenotypicFeatures": [
    {
      "type": {
        "id": "HP:0000001",
        "label": "Phenotypic abnormality"
      }
    }
  ]
}
id: example1
subject:
  id: subject1
  timeAtLastEncounter:
    timestamp: 2023-04-14T14:41:37+00:00
  sex: MALE
phenotypicFeatures:
- type:
    id: HP:0000001
    label: Phenotypic abnormality

julesjacobsen avatar Apr 17 '23 09:04 julesjacobsen

Will need to change README.rst into markdown for crates.io Markdown is also compatible with PyPI

julesjacobsen avatar Apr 17 '23 10:04 julesjacobsen

@iimpulse @pnrobinson Do you have any Rustacians who might want to test this? It works on my machine (TM).

julesjacobsen avatar Apr 18 '23 21:04 julesjacobsen

Hi @julesjacobsen, @justaddcoffee pinged me about the code snipped - what aspects do you need to test? Simply the execution?

LucaCappelletti94 avatar Apr 19 '23 07:04 LucaCappelletti94

Just tested. Looks great @julesjacobsen thanks a ton.

iimpulse avatar Apr 19 '23 16:04 iimpulse

@LucaCappelletti94 it should work fine as I've tested that as have Mike and Justin. I was more hoping for a critical rusty eye to check on package layout and perhaps suggest how to automate the mod.rs build part as that was a manual step to include the serde impl.

julesjacobsen avatar Apr 19 '23 21:04 julesjacobsen