
Add server indicator if a server is a cache

Open mitchell852 opened this issue 3 years ago • 7 comments

This Improvement request (usability, performance, tech debt, etc.) affects these Traffic Control components:

  • Traffic Ops
  • Traffic Portal

Current behavior:

ALL servers, caches and non-caches alike, are stored in the servers table and fetched via GET /api/servers. There is no reliable way to discern between caches and non-caches in the response. You can, however, use GET /api/servers?type=EDGE or GET /api/servers?type=MID, but types can be added (and have been in our environment) to satisfy different use cases, so you could have an EDGE_FOO type if you want.
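A small illustration of why filtering on the type name is brittle. This is a hypothetical helper, not existing Traffic Control code, and the non-cache type names are just examples:

```go
package main

import (
	"fmt"
	"strings"
)

// isCacheByTypeName guesses cache-ness from the server type's name, which is
// roughly the best a client can do today. An exact-match query like
// ?type=EDGE misses operator-defined types such as EDGE_FOO, so prefix
// matching is the usual fallback, and even that is only a convention.
func isCacheByTypeName(typeName string) bool {
	return strings.HasPrefix(typeName, "EDGE") || strings.HasPrefix(typeName, "MID")
}

func main() {
	for _, t := range []string{"EDGE", "MID", "EDGE_FOO", "TRAFFIC_MONITOR", "RIAK"} {
		fmt.Printf("%-16s cache? %v\n", t, isCacheByTypeName(t))
	}
}
```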

New behavior:

Add an indicator to the server object that marks whether or not it is a cache, so that you can do something like GET /api/servers?cache=true, and allow this value to be manipulated via the API and the UI.
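A minimal sketch of how such a filter could behave, assuming the new flag lands on the server representation as a nullable boolean. The names here (server, Cache, filterByCacheParam) are illustrative rather than actual Traffic Ops code, and the real endpoint would more likely push the condition into its SQL query:

```go
package servers

import (
	"net/http"
	"strconv"
)

// server is a pared-down stand-in for the real server object; only the
// proposed nullable cache flag matters for this sketch.
type server struct {
	HostName string
	Cache    *bool
}

// filterByCacheParam applies a ?cache=true|false query parameter after the
// rows have been fetched. An empty parameter means "no filter".
func filterByCacheParam(r *http.Request, all []server) ([]server, error) {
	raw := r.URL.Query().Get("cache")
	if raw == "" {
		return all, nil
	}
	want, err := strconv.ParseBool(raw)
	if err != nil {
		return nil, err // caller should respond with 400 Bad Request
	}
	filtered := make([]server, 0, len(all))
	for _, s := range all {
		if s.Cache != nil && *s.Cache == want {
			filtered = append(filtered, s)
		}
	}
	return filtered, nil
}
```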

Also, by adding this indicator, it would be easy to fix #4137

FYI: this problem is a result of overloading the server table, and this proposed solution is a bit of a band-aid. The real solution would be to split the servers table into 2 tables (imo), like servers and caches, but that is a much larger effort (from a db, api and UI perspective), and this band-aid could be the shorter-term solution that would facilitate the real solution.

mitchell852 avatar Sep 21 '22 14:09 mitchell852

The original plan with TPv2 was to artificially separate server types into cache and non-cache, but that turned out to be difficult and expensive, because what a "cache server" is varies depending on context and almost all of those various criteria cannot be directly filtered on using query string parameters available on the /servers endpoint. Essentially I'd have to make some caveat like "this is the cache servers table except/plus some servers that meet criteria x" and even then I'd have to do the filtering on every digest cycle on potentially tens of thousands of table rows.

tl;dr this overloading issue is straight-up getting in the way of UI improvements, it's not just an annoyance to developers.

ocket8888 avatar Sep 21 '22 15:09 ocket8888

Well, that all sounds great/hard, but don't you think a simple flag on a server (cache=true/false), set to true on migration if type=EDGE*/MID* and left for users to set on the rest, would go a long way (in the short term) toward letting users define what is/isn't a cache?

...but I guess that would mean, after the migration, caches would have to be determined by cache=true/false, ignoring the server types (EDGE*/MID*, for example) altogether.
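A rough sketch of what that one-time backfill could look like, assuming a new boolean column (called cache here) on the server table and that the server table references the type table by id; in practice this would live in a SQL migration rather than application code:

```go
package migrate

import "database/sql"

// backfillCacheFlag sets the hypothetical cache column to true for every
// server whose type name starts with EDGE or MID, leaving operators to
// correct the remainder by hand afterwards.
func backfillCacheFlag(db *sql.DB) error {
	const stmt = `
UPDATE server
SET cache = TRUE
FROM type t
WHERE server.type = t.id
  AND (t.name LIKE 'EDGE%' OR t.name LIKE 'MID%');`
	_, err := db.Exec(stmt)
	return err
}
```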

mitchell852 avatar Sep 21 '22 17:09 mitchell852

Server types should be eliminated entirely, IMO. Or at least cache servers should not have mutable, user-creatable types. You can fairly easily place an indicator on a server that shows it's a cache server with maybe 80% accuracy (wild guess). Being absolutely sure is probably a waste of time, because most of the remaining edge cases represent non-functioning configurations (but then why are they possible??), and historically our plan for dealing with such things has been to just assume nobody makes those particular mistakes.

A flag is possible, yes, and probably quite useful, but I'm saying that we cannot really do any better than that because of limitations of the data model (or lack thereof). And that's frustrating.

ocket8888 avatar Sep 21 '22 17:09 ocket8888

> A flag is possible, yes, and probably quite useful, but I'm saying that we cannot really do any better than that because of limitations of the data model (or lack thereof). And that's frustrating.

yep, understood and completely agree.

mitchell852 avatar Sep 26 '22 14:09 mitchell852

Hey @mitchell852, I'm interested in contributing to this issue and I'd like to learn more about it. I got here from the Good First Issue you mentioned. I am new to open source, but have understood about 60-70% of what's discussed in this issue. It seems to be a major issue to me, and I wonder if I have relevant experience to contribute to the project. I appreciate any information you can provide and look forward to contributing. Thank you, Utkarsh Chourasia

jammutkarsh avatar Feb 28 '23 04:02 jammutkarsh

> Hey @mitchell852, I'm interested in contributing to this issue and I'd like to learn more about it. I got here from the Good First Issue you mentioned. I am new to open source, but have understood about 60-70% of what's discussed in this issue. It seems to be a major issue to me, and I wonder if I have relevant experience to contribute to the project. I appreciate any information you can provide and look forward to contributing. Thank you, Utkarsh Chourasia

@JammUtkarsh - I doubt this is a "good first issue" issue. It touches a lot of things - database, api, ui - but you are more than welcome to take it on if you choose. Here are the other GFIs - https://github.com/apache/trafficcontrol/labels/good%20first%20issue

mitchell852 avatar Feb 28 '23 16:02 mitchell852

Hi @mitchell852, I am interested in contributing to this issue and several other issues relating to Traffic Ops and Traffic Portal.

I have been trying to read the documentation and get started with the project by setting up the backend (i.e., Traffic Ops) and frontend (i.e., Traffic Portal) but I have been struggling for a week and have not been able to make much progress. Despite trying several ways to start the application, I have not been successful so far.

These are the ways I have tried:

  1. After following the instructions on https://traffic-control-cdn.readthedocs.io/en/latest/development/index.html# and cloning the repository, I executed source dev/atc.dev.sh and ran atc build and atc start with no errors on my Mac machine where Docker is installed. However, upon running atc ready, nothing is printed on the console and commands like atc ready --wait trafficops or atc ready --wait trafficportal seem to be waiting indefinitely. Additionally, attempting to access the Traffic Portal interface via https://localhost:60443/ or https://localhost:8080/ has not been successful. Despite multiple attempts at restarting the process, the issue persists.
  2. I attempted to build Traffic Control using the pkg command following the instructions provided in the link https://traffic-control-cdn.readthedocs.io/en/latest/development/building.html#build-using-pkg. To ensure I was in the correct directory, I cloned the repo and in $GOPATH/src/github.com/apache/trafficcontrol, I ran pkg -a. However, the only output I saw was "Building weasel." and it appeared to be stuck indefinitely. This was attempted on my Mac machine and I have tried restarting the process multiple times, but the issue persists.
  3. After following the instructions in https://traffic-control-cdn.readthedocs.io/en/latest/development/building.html#build-using-docker-compose, I ran the command docker-compose build with no errors. However, I am unable to access the Traffic Portal interface at https://localhost:60443/ or https://localhost:8080/. Despite restarting the process multiple times, the issue persists.
  4. After referring to https://traffic-control-cdn.readthedocs.io/en/latest/development/building.html#building-individual-components, I attempted to execute build/clean_build.sh. However, I encountered errors during the process, shown below.
==================================================
WORKSPACE: /private/tmp/go/src/github.com/apache/trafficcontrol
BUILD_NUMBER: 12425.72b2332e
RHEL_VERSION: el7
TC_VERSION: 7.1.0
--------------------------------------------------
-rw-r--r--  1 zhuzikun  staff  50950600 Apr  3 22:37 dist/apache-trafficcontrol-7.1.0.tar.gz
+ for project in '"$@"'
+ [[ 0 -eq 1 ]]
+ ./build/build.sh traffic_monitor
+ tee dist/build-traffic_monitor.log
/opt/homebrew/bin/realpath
-----  Building traffic_monitor ...
/opt/homebrew/bin/realpath
git is /opt/homebrew/bin/git
go is /opt/homebrew/bin/go
/private/tmp/go/src/github.com/apache/trafficcontrol/build/functions.sh: line 173: type: rpmbuild: not found
rpmbuild not found in PATH
Error on line 97 of traffic_monitor/build/build_rpm.sh
traffic_monitor failed: traffic_monitor/build/build_rpm.sh
The following subdirectories had errors:
   traffic_monitor
Error on line 110 of ./build/build.sh
+ exit_code=1
+ '[' 1 -ne 0 ']'
+ echo 'Error on line 77 of build/clean_build.sh'
Error on line 77 of build/clean_build.sh
+ cleanup
++ id -u
+ '[' 501 -eq 0 ']'
+ exit 1
  5. After installing CentOS Stream 9 on my Mac machine using Parallels, I followed the instructions in https://traffic-control-cdn.readthedocs.io/en/latest/development/traffic_portal.html#installing-the-traffic-portal-developer-environment, except for step 4.2, where I didn't modify api.base_url to point to my Traffic Ops API endpoint. After this, I was able to access https://localhost:60443/. I then attempted to follow the instructions in https://traffic-control-cdn.readthedocs.io/en/latest/development/traffic_ops.html#installing-the-developer-environment, but encountered an error while attempting step 4, which involved using the reset and upgrade commands of admin (see app/db/admin for usage) to set up the traffic_ops databases. The error message is shown in the attached screenshot (Screen Shot 2023-04-03 at 22 58 01).

I am new to open source. I would greatly appreciate it if you could advise me on how to get started. I have been thinking about how to contribute to this issue and I have a few ideas. Perhaps the initial task involves augmenting the server object in the Server ORM with a new field named isCached (or cache for brevity). Next, the server table in the database has to be updated to include this new field, which would necessitate migration logic, as current Traffic Ops users (i.e., administrators) will be impacted by this change. Finally, both the GET and POST APIs will need to be modified to incorporate the new field in the server object.
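If it helps to visualize that first step, here is a trimmed-down, purely illustrative struct; the real server struct in the Go code has far more fields and versioned variants, so this only shows where the proposed flag would sit:

```go
package sketch

// Server is a heavily abbreviated, hypothetical version of the API's server
// object, included only to show the proposed addition.
type Server struct {
	HostName   string `json:"hostName" db:"host_name"`
	DomainName string `json:"domainName" db:"domain_name"`
	Type       string `json:"type" db:"type"`

	// Cache is the proposed indicator. A pointer keeps it nullable, so the
	// API can tell "never set" apart from an explicit false after migration.
	Cache *bool `json:"cache" db:"cache"`
}
```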

Thank you for your time reading this and have a nice day!

zikunz avatar Apr 03 '23 15:04 zikunz