CircuitNet Question about DRC Violations Prediction & Question about Feature Extraction

Thank you very much for your work, which has been incredibly helpful to me. However, I encountered some questions while reading the paper (CircuitNet: An Open-Source Dataset for Machine Learning in VLSI CAD Applications With Improved Domain-Specific Evaluation Metric and Learning Strategies) and exploring the source code of the CircuitNet project.

Labels for Model Training
In the paper, the task is described as: Given a globally routed design, build a model to predict DRC violations after detailed routing. This problem is formulated as a DRC hotspot detection task, with a threshold to categorize DRC hotspot regions and non-hotspot regions. The mathematical formulation is given as $g_{DRC}: X \in \mathbb{R}^{w \times h \times f} \rightarrow V_i \in \{0,1\}^{w \times h}$. However, upon examining the source code, I found that the labels used in the training process are continuous values rather than binary (0-1) images.

Loss Calculation
In the source code of the RouteNet model, the decoder's final layer uses a sigmoid activation function to map the model output to the range [0,1]. Meanwhile, the labels are normalized using min-max scaling. Is it appropriate to use Mean Squared Error (MSE) as the loss function in this case, given the difference in how the output and the labels are processed?
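A minimal sketch of the setup described above, assuming a flattened label map and plain Python lists. It is an illustration only, not the CircuitNet code: the helper names `min_max_scale`, `sigmoid`, and `mse`, and all sample values, are hypothetical.

```python
import math

def min_max_scale(values):
    """Scale a flat list of label values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant map: avoid division by zero and return all zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def sigmoid(x):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def mse(pred, target):
    """Mean squared error between two equally sized flat lists."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

# Hypothetical data: raw DRC violation counts per tile and decoder
# outputs before the final sigmoid.
raw_labels = [0.0, 2.0, 5.0, 10.0]
logits = [-3.0, -1.0, 0.0, 3.0]

labels = min_max_scale(raw_labels)    # continuous targets in [0, 1]
preds = [sigmoid(z) for z in logits]  # model outputs in (0, 1)
loss = mse(preds, labels)
```

In this sketch both the predictions and the targets end up in the unit interval, so the MSE compares values on the same scale.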

Dec 23 '24 08:12 wwwwwly

In this implementation, the labels are binary only at inference time. Binary labels can of course be used for training as well, and the performance is better, but in that case the threshold for the model becomes fixed.
Regarding the loss calculation, I don't understand your point. After min-max scaling, the labels are also in the range [0,1].
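As a rough illustration of why continuous training labels leave the threshold free, the sketch below binarizes one continuous prediction map with two different thresholds at inference time. The function name `binarize`, the strict `>` comparison, and the sample values are assumptions, not taken from the CircuitNet code.

```python
def binarize(pred_map, threshold):
    """Mark a tile as a hotspot (1) when its predicted value exceeds the threshold."""
    return [[1 if v > threshold else 0 for v in row] for row in pred_map]

# Hypothetical 2x2 prediction map with values in [0, 1].
pred_map = [[0.05, 0.40],
            [0.75, 0.95]]

# The same continuous prediction yields different hotspot maps,
# depending on the threshold chosen at inference time.
hot_low = binarize(pred_map, 0.3)
hot_high = binarize(pred_map, 0.8)
```

A model trained directly on binary labels would instead bake one threshold into the training targets.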

Dec 30 '24 02:12 apri0426

Thank you for your response. The previous issue has been temporarily resolved. However, when reading the feature extraction code (read.py), I encountered a few questions again. First, let me clarify that my major is Computer Science and Technology, so I don't have a deep understanding of digital integrated circuit back-end design. Therefore, the questions I ask might seem somewhat naive.

In read.py / class ReadInnovusOutput / read_place_def(), the regular expression \d+ is used to obtain the bounding box of pins in the place_pin_dict, but it does not account for the negative sign (-). However, after examining the DEF file, I noticed that the coordinates can potentially be negative.
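The difference can be reproduced with a small sketch. The sample coordinate string is hypothetical (not copied from a CircuitNet DEF file), and `-?\d+` is only one possible fix, assuming integer coordinates:

```python
import re

# Pattern as described above: matches digit runs only, so a leading "-" is dropped.
UNSIGNED = re.compile(r"\d+")
# Possible fix: allow an optional leading minus sign.
SIGNED = re.compile(r"-?\d+")

# Hypothetical coordinate part of a "+ LAYER ..." line in the PINS section.
coords = "( -140 0 ) ( 140 280 )"

unsigned_box = [int(v) for v in UNSIGNED.findall(coords)]
signed_box = [int(v) for v in SIGNED.findall(coords)]

print(unsigned_box)  # [140, 0, 140, 280]  (sign lost)
print(signed_box)    # [-140, 0, 140, 280]
```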

According to the LEF/DEF 5.7 Language Reference, the syntax for the PINS section of the DEF file is as follows: + LAYER layerName [SPACING minSpacing | DESIGNRULEWIDTH effectiveWidth] pt pt, where the symbol pt is described as: Represents a point in the design. This value corresponds to a coordinate pair, such as x y. You must enclose a point within parentheses, with space between the parentheses and the coordinates. For example, RECT ( 1000 2000 ) ( 1500 400 ). Therefore, the value of place_pin_dict should be $[x_{\text{lower left}}, y_{\text{lower left}}, x_{\text{upper right}}, y_{\text{upper right}}]$. However, in read.py / class ReadInnovusOutput / get_RUDY(), the values are unpacked using pin_left, pin_right, pin_lower, pin_upper = self.place_pin_dict[cell_pin_pair[1]]. This treats the list as [left, right, lower, upper], so pin_right receives the lower-left y coordinate and pin_lower receives the upper-right x coordinate. Is this order intended?
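The mismatch can be illustrated with a short sketch. The bounding-box values are hypothetical, and the second unpacking is only a suggestion that assumes the list is stored as [x_ll, y_ll, x_ur, y_ur], as the DEF syntax implies:

```python
# Hypothetical place_pin_dict entry, stored as [x_ll, y_ll, x_ur, y_ur].
bbox = [-140, 0, 140, 280]

# Unpacking order quoted from get_RUDY(): the names do not match the stored order.
pin_left, pin_right, pin_lower, pin_upper = bbox
as_written = {"left": pin_left, "right": pin_right,
              "lower": pin_lower, "upper": pin_upper}

# Order consistent with two DEF points (lower left, then upper right).
pin_left, pin_lower, pin_right, pin_upper = bbox
as_suggested = {"left": pin_left, "right": pin_right,
                "lower": pin_lower, "upper": pin_upper}

print(as_written)    # right = 0 (actually y_ll), lower = 140 (actually x_ur)
print(as_suggested)  # right = 140, lower = 0
```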

Feb 17 '25 07:02 wwwwwly

Thank you for your report. I think you are right: the obtained pin shape is wrong. Since it only affects the "PINS" section in the DEF, and the pin shape is much less important than the pin position in this case, this issue doesn't make much difference to the resulting features.

Mar 02 '25 10:03 apri0426