GenAIExamples icon indicating copy to clipboard operation
GenAIExamples copied to clipboard

[bug] ChatQnA Security Assessment (It is not a Security Audit)

Open dehatideep opened this issue 11 months ago • 17 comments

Priority

Undecided

OS type

Other (Please let us know in description)

Hardware type

CPU-other (Please let us know in description)

Installation method

  • [ ] Pull docker images from hub.docker.com
  • [ ] Build docker images from source

Deploy method

  • [X] Docker compose
  • [X] Docker
  • [X] Kubernetes
  • [X] Helm

Running nodes

Single Node

What's the version?

2b2c7ee2f5221432dc6020d006436b380a00e52e

Description

ChatQnA security Assessment: https://docs.google.com/document/d/1df20UOmqJ_30VW5i6MajxXbJn3KhwVHfxYo3oGt2W5o/edit?usp=sharing

Reproduce steps

See the security assessment details. These are based upon code reading.

Raw log

See the security assessment details. These are based upon code reading.

dehatideep avatar Dec 02 '24 06:12 dehatideep

@arun-gupta https://github.com/opea-project/GenAIExamples/issues/1220

dehatideep avatar Dec 02 '24 06:12 dehatideep

@arun-gupta Do you have comments?

feng-intel avatar Dec 03 '24 06:12 feng-intel

I was having a discussion with folks in the CNCF AI/ML working group. @dehatideep offered to help out with the security review of the OPEA samples. This is a result of that. That is all the context that I can provide.

It would be useful to look at these recommendations and see how we can improve the security of ChatQnA, and possibly other samples.

arun-gupta avatar Dec 03 '24 15:12 arun-gupta

@dehatideep

The assessment is really helpful! Thank you @dehatideep!

We will review it carefully and take some actions.

BR, Xigui

xiguiw avatar Dec 17 '24 08:12 xiguiw

I was having a discussion with folks in the CNCF AI/ML working group. @dehatideep offered to help out with the security review of the OPEA samples. This is a result of that. That is all the context that I can provide.

It would be useful to look at these recommendations and see how we can improve the security of ChatQnA, and possibly other samples.

Yes, definitely we'll review these recommendations and improve GenAI examples, I believe not only ChatQnA, but also other examples.

xiguiw avatar Dec 17 '24 08:12 xiguiw

@dehatideep

We are starting to fix problems pointed out by you. Here are something I am not sured.

Would you please help for some details in ChatQnA security Assessment: https://docs.google.com/document/d/1df20UOmqJ_30VW5i6MajxXbJn3KhwVHfxYo3oGt2W5o/edit?usp=sharing?

ChatQnA/docker_compose/install_docker.sh /install_docker.sh

Is the intention of keeping it readable to all by design? If not it should be restricted to 600, despite the fact that it is a public trusted key.

I did not get your point. Would you please give some more details. Thanks!

  1. ChatQnA/benchmark/accuracy/eval_crud.py

default="http://localhost:8888/v1/chatqna"

Default URL being http suggests non-default may also be http, instead of https? Clients should always know and verify the server’s identity, irrespective of whether the server is local or external.

Do you mean non-default should be https?

xiguiw avatar Jan 06 '25 09:01 xiguiw

@dehatideep

We are starting to fix problems pointed out by you. Here are something I am not sured.

Would you please help for some details in ChatQnA security Assessment: https://docs.google.com/document/d/1df20UOmqJ_30VW5i6MajxXbJn3KhwVHfxYo3oGt2W5o/edit?usp=sharing?

ChatQnA/docker_compose/install_docker.sh /install_docker.sh

Is the intention of keeping it readable to all by design? If not it should be restricted to 600, despite the fact that it is a public trusted key.

I did not get your point. Would you please give some more details. Thanks! [Deep] You are bringing Docker public GPG key, so that Docker server could be verified and trusted when you add docker repo to source list. Later down the line you add user to 'docker' group. Note that all of it is being done as 'sudo', which suggests that access to key is needed by only root and some privileged user. This suggests that key permission could be 600 or 660 (if a particular group need to access it as well). This restricts the attack surface because any random user bringing sth from docker will fail server verification. THIS IS MERELY DEFENSIVE. Nothing is broken per se here because this key itself is public. -------- steps in the code ---------

Install prerequisites

sudo apt-get -y install ca-certificates curl

Create the directory for the Docker GPG key

sudo install -m 0755 -d /etc/apt/keyrings

Add Docker's official GPG key

sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc

Set permissions for the GPG key

sudo chmod a+r /etc/apt/keyrings/docker.asc

  1. ChatQnA/benchmark/accuracy/eval_crud.py

default="http://localhost:8888/v1/chatqna"

Default URL being http suggests non-default may also be http, instead of https? Clients should always know and verify the server’s identity, irrespective of whether the server is local or external.

Do you mean non-default should be https?

[Deep] If you have a rule (say an iptable firewall, it is not there though) where this port 8888 is accessible only internally (localhost/127.0.0.1), then it is fine but as it exists today, an external host, say 10.11.12.13 may call it like: http://10.11.12.13:8888/v1/chatqna and this opens the door for spoofing/MITM attack because connection is insecure. It should all be https, particularly if this port can be accessed by external machines. Basically you have to create an https server instead of http and that would require a TLS certificates, which you can generate locally. This would be a self-signed cert but at least transport pipe would be opaque.

dehatideep avatar Jan 06 '25 17:01 dehatideep

@dehatideep

Thank you for your explanation in detail!

I checked the installation of docker. It seems this script copied from docker website

Just keep it unchanged if there is no potential issue. Think twice, install docker script should not be part of the OPEA, I would like to remove it.

image

For the https interfaces. Yes, we agree with you. The problem is we cannot sign certificates.

OPEA is a reference design. We plan to put the https interface implementation to the customized implementation stage. Do you have any suggestions, comments for this? Thanks!

xiguiw avatar Jan 08 '25 07:01 xiguiw

https://github.com/opea-project/GenAIExamples/issues/1220#issuecomment-2576889386 Docker GPG Key a+r permission is fine. Recommendation was related to reducing the attack surface, given after Docker installation, no one should need to change/update the installation, except root and/or privileged user. This is fine though, given it is only read permission.

Coming to http vs https pipe, your ref design must have https IMO. You can always generate a self-signed certificate for the server with the advice that customized implementation must install their CA Signed cert installed on the server and make Signer cert available to users, just the way Docker public key is brought in the first case above. When server is running default self-signed server, clients can skip cert verification. Note that self-signed cert is also a problem, given it cant be verified, so client doesn't know if connection is spoofed, however it ensures that pipe is opaque to everyone else. I have no opinion if you want to defer it to customized implementation, just add comments somewhere, else people simply clone the repo and the issue remains buried there.

dehatideep avatar Jan 08 '25 16:01 dehatideep

@dehatideep Thank you for your suggestions!

I'll add some comments in README to keep everyone aware and be noticed about https secure connection here.

xiguiw avatar Jan 13 '25 02:01 xiguiw

Tagging @kdruckman for awareness

arun-gupta avatar Jan 13 '25 05:01 arun-gupta

@arun-gupta

GenAIExamples/ChatQnA/docker_compose/install_docker.sh This bash script installs docker on linux, it's copied from docker website.

I would like to remove it. Any comment?

xiguiw avatar Jan 17 '25 02:01 xiguiw

Remind @xiguiw please take care

yinghu5 avatar Mar 21 '25 10:03 yinghu5

#1220 (comment) Docker GPG Key a+r permission is fine. Recommendation was related to reducing the attack surface, given after Docker installation, no one should need to change/update the installation, except root and/or privileged user. This is fine though, given it is only read permission.

Coming to http vs https pipe, your ref design must have https IMO. You can always generate a self-signed certificate for the server with the advice that customized implementation must install their CA Signed cert installed on the server and make Signer cert available to users, just the way Docker public key is brought in the first case above. When server is running default self-signed server, clients can skip cert verification. Note that self-signed cert is also a problem, given it cant be verified, so client doesn't know if connection is spoofed, however it ensures that pipe is opaque to everyone else. I have no opinion if you want to defer it to customized implementation, just add comments somewhere, else people simply clone the repo and the issue remains buried there.

Our whole pipeline should be accessible only via HTTPS for security of prompt, data, response etc. And production systems need real certificate chains with a well-known Certificate Authority. So, this goes beyond evaluation. We could add a page/link for production best practices and give a pointer to the same in all our docs.

mkbhanda avatar May 12 '25 09:05 mkbhanda

@dehatideep

Thank you for your explanation in detail!

I checked the installation of docker. It seems this script copied from docker website

Just keep it unchanged if there is no potential issue. Think twice, install docker script should not be part of the OPEA, I would like to remove it.

image

For the https interfaces. Yes, we agree with you. The problem is we cannot sign certificates.

OPEA is a reference design. We plan to put the https interface implementation to the customized implementation stage. Do you have any suggestions, comments for this? Thanks!

In the documentation we can add a sentence, we assume docker has been installed correctly with the right degree of security. If necessary, please refer to

mkbhanda avatar May 12 '25 09:05 mkbhanda

Tagging @eero-t for awareness

mkbhanda avatar May 12 '25 14:05 mkbhanda

@poussa can comment more on certificate use on K8s side.

As to other items...

  • Neither host networking nor NICE capability are needed for Kubernetes, so I would assume them to be unnecessary also for Docker usage (IMHO latter belongs more to some "performance fine-tuning" documentation, than default spec, and needs some impact numbers to back it up)

  • align_* functions are callbacks for Orchestrator code in GenAIComps. That should be considered at the same time with applications' code in GenAIExamples

  • Benchmarking should be running in more limited environment with no access to outside, so I would think its risks more limited than for code which is exposed outside cluster / to actual users

  • Most of Kubernetes issues are already handled, only remaining items should be ones needing changes to code, not just to K8s specs:

    • UI running as root: https://github.com/opea-project/GenAIExamples/issues/517
    • Writable rootfs: https://github.com/opea-project/GenAIComps/issues?q=is%3Afeature%20author%3Alianhao

eero-t avatar May 12 '25 16:05 eero-t