radiobrowser-api
radiobrowser-api copied to clipboard
Normalize tag data model
I took a look at the SQL dump and found that there is a TagCache table listing the tags, but that it is not related to the Stations table.
The current database has tags just stored as a string delimited by commas:
(57613, 104897, 'Radio Progreso y ERIC - El Progreso', 'http://192.99.233.46:8900/;', 'http://radioprogresohn.net/', 'http://radioprogresohn.net/media/com_zo2framework/images/favicon.ico', '2017-12-02 04:26:29', 'Honduras', 'Yoro', 'Spanish', 'derechos humanos,jesuit,el progreso', 35, 0, '24.246.81.175', '9649a4df-0601-11e8-ae97-52543be04c81','9649a4db-0601-11e8-ae97-52543be04c81')
(here the tags are ["derechos humanos", "jesuit", "el progreso"])
and there's a /cache/ which I assume gets rebuilt manually somewhere:
DROP TABLE IF EXISTS `TagCache`;
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `TagCache` (
`TagName` varchar(100) COLLATE utf8_bin NOT NULL,
`StationCount` int(11) DEFAULT '0',
`StationCountWorking` int(11) DEFAULT '0',
PRIMARY KEY (`TagName`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
/*!40101 SET character_set_client = @saved_cs_client */;
This situation is a many-to-many relationship and it would be cleaner to use an intermediate "junction" table:
CREATE TABLE `Station` (
`StationID` int(11) NOT NULL AUTO_INCREMENT,
-- ...
-- *no* `Tags` field here
-- ...
PRIMARY KEY (`StationID`)
)
CREATE TABLE `Tag` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(100) COLLATE utf8_bin NOT NULL,
PRIMARY KEY (`id`)
)
CREATE TABLE `Station_Tag` (
`StationID` int(11) NOT NULL,
`TagID` int(11) NOT NULL
FOREIGN KEY (StationID)
REFERENCES Station(id),
FOREIGN KEY (TagID)
REFERENCES Tag(id)
)
I can't find where in the code TagCache gets populated; is there a cronjob that's not checked into the repo or am I missing something obvious? In any case, you could keep the table working by rewriting it as a view, and if performance becomes a problem you could materialize it (unfortunately with MySQL you need to write this manually). I think this should do it, but I haven't tested it, so it's just to show the idea:
CREATE VIEW TagCache AS
SELECT
Tag.id as TagID,
Tag.name as name,
count(*) as StationCount,
count(Station.working = 1) as StationCountWorking
FROM
-- this inner select deals with Stations having two IDs: a UUID and a SQL ID column
(SELECT Station.id as id, StationCheck.CheckOK as working
FROM
Station
INNER JOIN
StationCheck
WHERE Station.StationUuid = StationCheck.StationUuid) Station
INNER JOIN Station_Tag where Station.id = Station_Tag.StationID
INNER JOIN Tag where Tag.id = Station_Tag.TagID
GROUP BY TagID
I would like to write a migration to implement this change. It would make #56 easier for me.
(this should also save a bunch of storage since it compresses tags to 4 byte integers instead of repeating their whole names in each station entry)