
modify fabric to calculate emissions from REST api

Open sichen1234 opened this issue 2 years ago • 23 comments

Modify fabric to get emissions factors from postgres database in data/postgres/ instead of directly from Fabric's couchdb, possibly by specifying ip address/port connection to postgres or using a docker container if really necessary.

sichen1234 avatar Mar 18 '22 00:03 sichen1234

After #500, once the data is loaded into postgres, modify the chaincode that is accessing this seed data, for example in https://github.com/hyperledger-labs/blockchain-carbon-accounting/blob/main/emissions-data/chaincode/emissionscontract/typescript/src/lib/emissionsFactor.ts, to access it directly from postgres database.

As a test try accessing any postgres database from chaincode and see if it works.

sichen1234 avatar Apr 13 '22 16:04 sichen1234

@sichen1234 getting caught up on this issue as it relates to the data integration mentorship assigned to @Ackintya. This blog discusses support for other state DBs, and highlights that Fabric only supports CouchDB or LevelDB natively. He mentions a working document on using Go plugins for pluggable ledger state databases. Is this what you have in mind? If so, we need to put together a plan, as the solution requires forking Fabric (they recommend just using CouchDB or LevelDB).

brioux avatar Jun 15 '22 16:06 brioux

I don't want to use a different database with Fabric, only that the chain code be able to access an external data source. All the Fabric transactions should stay on whatever database it uses.

In the very beginning we got the chain code to connect to Amazon's DynamoDB using external API calls.

Can we set it up so the chain code can access postgres as well to get the emissions factors?


Si Chen Open Source Strategies, Inc.

*Why aren't we decarbonizing when it's profitable? How can we fix it? See our Blog https://www.opensourcestrategies.com/2022/04/13/using-the-blockchain-for-supply-chain-decarbonization-with-emissions-transfers/ and White Paper https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4082449 *


sichen1234 avatar Jun 15 '22 16:06 sichen1234

OK, I understand the issue better now. The emissions factors will not be stored in the Fabric state DB (i.e., CouchDB); instead, the chaincode calls an external service (i.e., the PostgreSQL DB). This makes sense, as it avoids loading entire datasets into the Fabric state DB.

brioux avatar Jun 15 '22 17:06 brioux

In this case I think a good first task for @Ackintya is to work on revising the chaincode to pull data from an external resource.

He can start by working on modifying the emissions chaincode to connect to the external postgreSQL database where all the emissions data is now being loaded.

This directory will replace the data loaded into fabric couchDB using egrid-data-loader.

brioux avatar Jun 15 '22 17:06 brioux

@Ackintya here is an example of where the emissions chaincode will need to be modified. The function getEmissionsRecord of the EmissionsRecordState should query the external Postgres DB for the uuid of the emission record, instead of the internal state DB (CouchDB).

brioux avatar Jun 15 '22 17:06 brioux

I have been thinking about this issue over the weekend. There are two approaches to using an external database to access emission records.

  1. New approach: the chaincode queries records using an API call. Each Fabric peer has to submit the query and expects the API to return the same result.
  2. Old approach: an organization queries the external database (e.g., postgres) before calling the chaincode. The organization requests peers to store the record on the Fabric internal state DB (e.g., the importUtilityFactor chaincode function). Peers handle audit requests by querying the shared state DB, with no API calls.

brioux avatar Jun 19 '22 16:06 brioux

@Ackintya, @sichen1234 is asking to implement 1. The organization tells the network where to get the data from, or the API address/functions are hard coded into the chaincode.

I read this thread, which warns that setting up external API calls inside Fabric could result in consensus issues, i.e., if peers receive different results.

There is still Fabric documentation on how to do this here, but it may not be updated for recent versions. If we take this route, we need to research this further.

This can be an issue when running a network with a large number of peers where access to the external service is unstable, so that peers don't receive the same result. With only a few organizations/peers on the audit channel and a stable connection to the external DB, this should not be a major issue.
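To make the consensus concern concrete, here is a minimal sketch of how two peers' API responses could be compared after putting them into a canonical form. The helpers canonicalize and sameResult are hypothetical names for illustration, not part of Fabric or this repository, and the sketch assumes the API returns plain JSON.

```typescript
// Hypothetical helper: serialize a JSON value with object keys sorted,
// so the same data in a different key order yields the same string.
function canonicalize(value: unknown): string {
  if (Array.isArray(value)) {
    return "[" + value.map(canonicalize).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const keys = Object.keys(obj).sort();
    return (
      "{" +
      keys.map((k) => JSON.stringify(k) + ":" + canonicalize(obj[k])).join(",") +
      "}"
    );
  }
  // Primitives (string, number, boolean, null)
  return JSON.stringify(value);
}

// Two peers agree only if their canonical responses are identical.
function sameResult(peerA: unknown, peerB: unknown): boolean {
  return canonicalize(peerA) === canonicalize(peerB);
}
```

Key order does not matter here, but any difference in a value (for example a factor that was updated between two peers' calls) makes the comparison fail, which is the situation the thread warns about.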

If we stick to the old approach, my understanding is that the only difference from how the Fabric network is currently set up is that the emission records are no longer all written directly to the state DB (e.g., using the egrid-data.load.sh script). Each organization can set up its own connection to an external emission database (e.g., the postgres data-loader). Only records submitted for audit are written to the state DB (so peers do not have to query the external service).

brioux avatar Jun 19 '22 16:06 brioux

Benefits of the new approach (assuming consensus is not an issue).

  1. The chaincode can be configured to whitelist recognized/trusted emission record APIs
  2. No need for internal state DB to replicate existing emission record database
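A minimal sketch of the first benefit, assuming the chaincode keeps a fixed list of trusted API origins. The origins below are placeholders and isTrustedEndpoint is a hypothetical helper, not existing chaincode.

```typescript
// Hypothetical list of trusted emission-record API origins.
// These hostnames are placeholders, not real services.
const TRUSTED_ORIGINS: ReadonlySet<string> = new Set([
  "https://oracle.example.org",
  "http://localhost:3002",
]);

// Returns true only when the endpoint parses as an absolute URL and its
// origin (scheme + host + port) is on the trusted list.
function isTrustedEndpoint(
  endpoint: string,
  trusted: ReadonlySet<string> = TRUSTED_ORIGINS
): boolean {
  try {
    return trusted.has(new URL(endpoint).origin);
  } catch {
    return false; // not a valid absolute URL
  }
}
```

Comparing the parsed origin rather than a string prefix avoids accepting look-alike hosts such as oracle.example.org.evil.com.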

brioux avatar Jun 19 '22 16:06 brioux

I don't think we need to try to fix the consensus issue. A lot of smart contract code execution relies on some external service to provide them data. Oracles are designed to address the reliability of the external data. We can deal with that when we get there.

This new approach to get emissions factors is better in that it keeps the Fabric database only for transaction records, and keeps the external data out of it.


sichen1234 avatar Jun 20 '22 15:06 sichen1234

These commits have code that shows how to access Amazon DynamoDB from chain code: 74ec825b2178de0b079e4fe565f245d4f2286704 [74ec825b] b5ffcfece12dc271eea5bba6f8dfdbc958391e1a [b5ffcfec] 7c5c3fabbad18509a752a146a262f2fdb6197825 [7c5c3fab] c21a1c4e6d25f37e895b6dae831bcf6836ed8a61 [c21a1c4e]

This could be a good example for accessing external data sources from chain code, even if we're using postgres instead of dynamodb.

sichen1234 avatar Jun 20 '22 19:06 sichen1234

Based on discussions this morning, it's better to create a REST API server to provide emissions calculations based on lib/emissions_data/src/lib/emissions-calc.ts. Then the Fabric chain code will call this REST API to get the emissions and record them on the Fabric network.

This will simulate working with an external oracle service or API service that calculates emissions but does not provide the emissions factors. It will also reuse the code in lib/emissions_data/src, which is used by other apps like supply-chain.
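As a rough illustration of what such an endpoint would compute once it has looked up a factor, here is a sketch with simplified, hypothetical types. The real types in lib/emissions_data/src carry more fields, and the field names below are assumptions.

```typescript
// Hypothetical, simplified shapes for illustration only.
interface FactorSketch {
  co2PerUnit: number;   // emissions per one activity unit
  activityUom: string;  // unit the factor is expressed in, e.g. "MWH"
  emissionsUom: string; // unit of the result, e.g. "kg"
}

interface EmissionsRequest {
  amount: number;
  uom: string;
}

interface EmissionsResponse {
  value: number;
  uom: string;
}

// What a REST handler could do after the factor lookup: validate the
// input, check units, multiply, and return only the result (the factor
// itself is not exposed to the caller).
function calculateEmissions(
  factor: FactorSketch,
  req: EmissionsRequest
): EmissionsResponse {
  if (!Number.isFinite(req.amount) || req.amount < 0) {
    throw new Error("amount must be a non-negative number");
  }
  if (req.uom.toUpperCase() !== factor.activityUom.toUpperCase()) {
    throw new Error(
      `unit mismatch: got ${req.uom}, factor uses ${factor.activityUom}`
    );
  }
  return { value: req.amount * factor.co2PerUnit, uom: factor.emissionsUom };
}
```

This sketch does no unit conversion; a mismatch is simply rejected.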

sichen1234 avatar Jun 23 '22 20:06 sichen1234

@Ackintya I am looking at your modifications to the emissionsRecordContract.ts

First, the REST API should accept from the Fabric chaincode any cmd + arguments combination to be relayed to the external DB server, and return a CO2EmissionFactorInterface object as the response. This is stored in the co2Emission variable in the Fabric chaincode.

The emissionsRecordContract.ts chaincode file will no longer call the following against the Fabric state DB: getUtilityLookupItem and getEmissionsFactorByLookupItem. These calls should be dropped, since this data is all stored in the external DB.

There are different ways to get a CO2EmissionFactorInterface object:

  1. Use the existing postgres server cmd npm run pg:getData activity-emissions <scope> <level1> <level2> <level3> <level4> <text> <amount> [uom], which calls getCO2EmissionByActivity.
  2. Replicate how the chaincode currently gets emissions using utilityId: getUtilityLookupItem -> getEmissionsFactorByLookupItem -> getCO2EmissionFactor. However, this logic needs to be performed by the postgres DB server, not Fabric. The methods have already been moved to data/src/repositories (see the links...). E.g., create a new pg command like npm run pg:getData utility-id-emissions <utilityId> <amount> [uom] that would call getCO2EmissionByUtilityId.

In both cases the Fabric user has to tell the REST API to send the command and attributes to the server operating the external DB. This requires modifying the inputs of recordEmissions.
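The cmd + arguments relay described above could be sketched as a small dispatcher. Everything here is hypothetical: EmissionSketch stands in for CO2EmissionFactorInterface, the handlers are stubs with made-up factors, and a real server would query postgres instead.

```typescript
// Hypothetical result shape standing in for CO2EmissionFactorInterface.
interface EmissionSketch {
  value: number;
  uom: string;
}

type Handler = (args: string[]) => EmissionSketch;

// Stub handlers keyed by command name.
const handlers = new Map<string, Handler>([
  [
    "activity-emissions",
    (args) => {
      // args: scope, level1, level2, level3, level4, text, amount, [uom]
      if (args.length < 7) {
        throw new Error("activity-emissions expects at least 7 arguments");
      }
      return { value: Number(args[6]) * 0.5, uom: "kg" }; // 0.5 is made up
    },
  ],
  [
    "utility-id-emissions",
    (args) => {
      // args: utilityId, amount, [uom]
      if (args.length < 2) {
        throw new Error("utility-id-emissions expects at least 2 arguments");
      }
      return { value: Number(args[1]) * 0.25, uom: "kg" }; // 0.25 is made up
    },
  ],
]);

// Relay a command to its handler, rejecting anything not registered.
function relayCommand(cmd: string, args: string[]): EmissionSketch {
  const handler = handlers.get(cmd);
  if (!handler) {
    throw new Error(`unsupported command: ${cmd}`);
  }
  return handler(args);
}
```

Whichever command is used, the caller gets back the same result shape, so the chaincode does not need to know which lookup path produced it.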

brioux avatar Jul 05 '22 02:07 brioux

As an example, we can use the activity-emissions cmd to get data for uuid 3622b20d-1e94-4490-ba8b-6e0b73910e2 by sending the following args: scope: "SCOPE 2", level1: "eGRID EMISSIONS FACTORS", level2: "USA", level3: "STATE: VA"

or use getUtilityLookupItem(uuid) directly.
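For illustration, the args above could be packaged into a request like this. The ActivityQuery shape, the endpoint path, and the amount and uom values are assumptions made for the sketch, not the actual API.

```typescript
// Hypothetical request shape; field names mirror the args listed above.
interface ActivityQuery {
  scope: string;
  level1: string;
  level2: string;
  level3: string;
  amount: number;
  uom?: string;
}

interface HttpRequestSketch {
  method: "POST";
  path: string;
  body: string;
}

// Validate the query and serialize it as a JSON request body.
function buildActivityRequest(q: ActivityQuery): HttpRequestSketch {
  for (const key of ["scope", "level1", "level2", "level3"] as const) {
    if (q[key].trim() === "") {
      throw new Error(`${key} must not be empty`);
    }
  }
  if (!Number.isFinite(q.amount) || q.amount <= 0) {
    throw new Error("amount must be a positive number");
  }
  return {
    method: "POST",
    path: "/emissions/activity", // placeholder path
    body: JSON.stringify(q),
  };
}

const exampleRequest = buildActivityRequest({
  scope: "SCOPE 2",
  level1: "eGRID EMISSIONS FACTORS",
  level2: "USA",
  level3: "STATE: VA",
  amount: 100, // made-up activity amount
  uom: "MWH",  // made-up unit
});
```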

brioux avatar Jul 05 '22 03:07 brioux

I definitely think the better way to do this is @brioux's option 1. The chain code should call the API, which should call getCO2EmissionByActivity.

sichen1234 avatar Jul 06 '22 15:07 sichen1234

@Ackintya Please look at the process_electricity method in lib/src/emissions-utils.js. It maps the utility fields into the emissions factors fields. You can call this method from the REST API directly and map the fields to its input, or follow its logic to call the utility item lookup and then map the output to the get emissions factors call. Since we're just getting electricity emissions in the Fabric chain code, it might be better to call process_electricity directly.

What do you think, @brioux?

sichen1234 avatar Jul 13 '22 18:07 sichen1234

@sichen1234 clarification first: I think you mean call lib/supply-chain/src/process_electricity...

@Ackintya, you can use your REST API (oracle) to call the DB directly. Sorry, I made a mistake: there is no need to use app/api-server.

You can use process_electricity as suggested above or any other function that uses the EmissionsFactorRepo, including the getCO2EmissionByActivity method we identified originally.

FYI - PostgresDBService.getInstance() establishes the connection to the postgres DB.

Make sure your .env variables are configured to the values used by your postgres database if they are different from the default values set in data/src/config.ts
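A small sketch of the "env value, else default" pattern described here. The variable names (DB_HOST, DB_PORT, DB_NAME, DB_USER) and the default values are hypothetical; check data/src/config.ts for the names and defaults actually used. Only the port 5432 is the standard PostgreSQL default.

```typescript
// Hypothetical config shape for illustration.
interface DbConfig {
  host: string;
  port: number;
  database: string;
  user: string;
}

// Read connection settings from an env-like object, falling back to
// defaults for anything that is not set.
function loadDbConfig(env: Record<string, string | undefined>): DbConfig {
  const port = Number(env.DB_PORT ?? "5432");
  if (!Number.isInteger(port) || port <= 0 || port > 65535) {
    throw new Error(`invalid DB_PORT: ${env.DB_PORT}`);
  }
  return {
    host: env.DB_HOST ?? "localhost",
    port,
    database: env.DB_NAME ?? "emissions", // made-up default
    user: env.DB_USER ?? "postgres",      // made-up default
  };
}
```

In a Node process this would typically be called with process.env after the .env file has been loaded.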

brioux avatar Jul 14 '22 15:07 brioux

@sichen1234 These two functions (process_electricity and getCO2EmissionByActivity) expect inputs (ElectricityActivity and ActivityInterface) that do not align with what the Fabric chaincode requests from an organization, i.e. recordEmissions(utilityId .....

The chaincode requires utilityId to get the emissions factor (it corresponds to uuid in the pg table). We will need to update the chaincode inputs and higher-level functions (e.g., swagger API) to accommodate different emission calculation requests (host and query/calc arguments).

@Ackintya to avoid having to change the Fabric chaincode FOR NOW, you can set up a new EmissionsFactorRepo method similar to getCO2EmissionByActivity that is called by the Oracle. It would require only the uuid to query the emission-factor table using getEmissionFactor = async (uuid: string), not the activity data.

brioux avatar Jul 14 '22 16:07 brioux

@Ackintya keep in mind that, irrespective of the source DB and method used, results should be converted into a general type validated by the Oracle based on the requirements of the Fabric chaincode, e.g., ActivityResult from process_electricity or CO2EmissionFactorInterface from getCO2EmissionByActivity.
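One way the Oracle could validate results against a single general type is a runtime type guard. OracleEmission below is a hypothetical common shape chosen for the sketch, not the actual ActivityResult or CO2EmissionFactorInterface definition.

```typescript
// Hypothetical common result type the oracle hands back to chaincode.
interface OracleEmission {
  value: number;
  uom: string;
  source: string; // which backend or method produced the result
}

// Runtime check: JSON from an external service is untyped, so the
// oracle verifies the shape before passing it on.
function isOracleEmission(x: unknown): x is OracleEmission {
  if (typeof x !== "object" || x === null) {
    return false;
  }
  const { value, uom, source } = x as Record<string, unknown>;
  return (
    typeof value === "number" &&
    Number.isFinite(value) &&
    typeof uom === "string" &&
    uom.length > 0 &&
    typeof source === "string"
  );
}

// Parse a raw response body and reject anything with the wrong shape.
function parseOracleEmission(json: string): OracleEmission {
  const parsed: unknown = JSON.parse(json);
  if (!isOracleEmission(parsed)) {
    throw new Error("response does not match the expected emission shape");
  }
  return parsed;
}
```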

brioux avatar Jul 14 '22 16:07 brioux

You're right. There's no reason that the Fabric chain code only handles electricity. We could generalize it more.

I think the best way is to set up a REST API endpoint for each process_ method in

lib/supply-chain/src/emissions-utils.ts

The current method in chain code also outputs more data, like the percentage of renewables. If we need that I think we should modify it in the lib/supply-chain/src/emissions-utils.ts and then have the output come out of the REST API. Then we can store it in Fabric. Or, the results could be stored as metadata on ERC tokens.


Si Chen Open Source Strategies, Inc.

Why open source and blockchain for carbon accounting? Video https://www.youtube.com/watch?v=eNM7V8vQCg4 and Blog Post https://www.opensourcestrategies.com/2022/06/01/why-open-source-carbon-accounting/

On Thu, Jul 14, 2022 at 9:17 AM Bertrand Rioux @.***> wrote:

One more comment regarding the outputs of (process_electricity and getCO2EmissionByActivity) ActivityResult https://github.com/hyperledger-labs/blockchain-carbon-accounting/blob/ea57504d24d87615fd60ecc5fe0d0f4d2bb5aef8/lib/supply-chain/src/common-types.ts#L88 CO2EmissionFactorInterface https://github.com/hyperledger-labs/blockchain-carbon-accounting/blob/ea57504d24d87615fd60ecc5fe0d0f4d2bb5aef8/lib/emissions_data/src/emissions-calc.ts#L42

Both include emissions data that should be returned to Fabric. However, the Oracle should always expect a result with the same structure, irrespective of where it comes from. I suggest sticking to something like CO2EmissionFactorInterface as a general emissions_data type, whatever the source.

@sichen1234 https://github.com/sichen1234 for this reason I would suggest using getCO2EmissionByActivity. Also, while the Fabric utility-emission channel was designed for electricity, don't we want to generalize it to serve any emission source?


sichen1234 avatar Jul 14 '22 21:07 sichen1234

@Ackintya to avoid having to change the Fabric chaincode FOR NOW, you can set up a new function, e.g., getCO2emissionsByUtilityId(utilityId, thruDate, activity_uom, activity_amount)

It requires only the utilityId (the uuid in the utility_lookup_item table) and thruDate to query the emission factor, instead of activity data.

Replicate what the Fabric chaincode does:

  1. Use the utilityLookup = db.getUtilityLookupItemRepo().getUtilityLookupItem(utilityId) method from data/src/repositories/utilityLookupItem.repo.ts to get a UtilityLookupItemInterface object.
  2. Then use the emissionFactor = db.getEmissionsFactorRepo().getEmissionsFactorByLookupItem(utilityLookup,thruDate) method from data/src/repositories/utilityLookupItem.repo.ts.

Finally calculate the emissions for the activity to return to Fabric!
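The three steps (lookup item, factor by lookup item and date, then calculation) can be sketched with in-memory stand-ins for the two repositories. All type names, field names, and data below are assumptions for illustration; in particular, the real factor lookup matches on more than a region and a year, and the real repository calls are asynchronous.

```typescript
// Simplified stand-ins for the repo types; field names are assumptions.
interface UtilityLookupSketch {
  utilityId: string;
  region: string;
}

interface RegionFactorSketch {
  region: string;
  year: number;
  co2PerMwh: number;
  uom: string;
}

// In-memory stand-ins for the two postgres repositories (made-up data).
const lookups: UtilityLookupSketch[] = [{ utilityId: "U-1", region: "VA" }];
const factors: RegionFactorSketch[] = [
  { region: "VA", year: 2019, co2PerMwh: 300, uom: "kg" },
  { region: "VA", year: 2020, co2PerMwh: 280, uom: "kg" },
];

function emissionsByUtilityIdSketch(
  utilityId: string,
  thruDate: string, // ISO date, e.g. "2020-06-30"
  amountMwh: number
): { value: number; uom: string; year: number } {
  // Step 1: utility id -> lookup item
  const lookup = lookups.find((l) => l.utilityId === utilityId);
  if (!lookup) {
    throw new Error(`no utility lookup item for ${utilityId}`);
  }
  // Step 2: lookup item + date -> factor for that region and year
  const year = new Date(thruDate).getUTCFullYear();
  const factor = factors.find(
    (f) => f.region === lookup.region && f.year === year
  );
  if (!factor) {
    throw new Error(`no factor for ${lookup.region} in ${year}`);
  }
  // Step 3: activity amount x factor
  return { value: amountMwh * factor.co2PerMwh, uom: factor.uom, year };
}
```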

brioux avatar Jul 14 '22 23:07 brioux

Wouldn't it be better to call the REST API from inside chain code, so that later we can substitute it for another API if needed?


sichen1234 avatar Oct 11 '22 07:10 sichen1234

The approach adopted in PR 616 was to introduce a new oracle api into the chaincode, rather than an explicit connection to the existing rest API.

The oracle can then be configured to relay connections to an approved rest-api with the required DB connection.

For now the DB connection is hardcoded into the Oracle API. The calls to the DB repositories required by the Fabric chaincode are not yet set up in the postgres rest-api, which needs to be updated.

I.e., https://github.com/hyperledger-labs/blockchain-carbon-accounting/blob/25bc6eb48c61ea21e7d1485b9e5da80f36ec8533/app/api-oracle/postgresApi.ts#L63

The chaincode and swagger-api tests were set up to get emissions by lookup item, and not the newer 'getEmissions' by activity added to the rest-api trpc routers. https://github.com/hyperledger-labs/blockchain-carbon-accounting/blob/25bc6eb48c61ea21e7d1485b9e5da80f36ec8533/app/api-server/trpc/emissions-factors.trpc.ts#L135

I am looking for a candidate to:

  1. work on modifying the chaincode and oracle api to request emission records using existing rest-api routers,
  2. extend the rest-api to handle the original emission requests set up within the Fabric chaincode.

brioux avatar Oct 11 '22 16:10 brioux