openwrt icon indicating copy to clipboard operation
openwrt copied to clipboard

realtek: clock driver for RTL83XX (v7)

Open plappermaul opened this issue 2 years ago • 21 comments

Tested on DGS-1210-20. Maybe far from perfect but I like to get feedback. Once applied we get the following new features:

root@OpenWrt:/# cat /sys/kernel/debug/clk/clk_summary
                       enable  prepare  protect                                duty
   clock                count    count    count        rate   accuracy phase  cycle
---------------------------------------------------------------------------------------------
 oscillator                 1        1        0    25000000          0     0  50000
    lxb_clk                 2        2        0   200000000          0     0  50000
    mem_clk                 0        0        0   300000000          0     0  50000
    cpu_clk                 0        0        0   500000000          0     0  50000
root@OpenWrt:/# dmesg | grep MHz
[    0.000000] Frequencies: CPU:500MHz LXB:200MHz MEM:300MHz

Plus: The clock driver already has initial code to reduce the RTL838X CPU clock down to 300MHz. This can be used by some cpufreq driver in the future.

plappermaul avatar Jul 29 '22 13:07 plappermaul

memo to myself: Reorder registration to match https://elixir.bootlin.com/linux/latest/source/drivers/clk/mediatek/clk-mt8192.c

plappermaul avatar Jul 29 '22 20:07 plappermaul

To me the clock driver looks self-contained, so why not make it a platform driver? Then you can have a device to utilize instead of working around it.

Also, I would prefer to get rid of mach manual setup, we have DT and with a proper clock driver thee is really no need for it.

robimarko avatar Jul 30 '22 10:07 robimarko

Regarding the platform driver we might have a chicken-egg-problem. We want to read CPU frequencies in setup.c to define mips_hpt_frequency. At this point not platform driver has been loaded yet. Earliest possibly driver init call starts later.

<<< NEEDED HERE >>>
[    0.000000] Frequencies: CPU:500MHz
[    0.000000] clocksource: MIPS: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041786 ns
[    0.000020] sched_clock: 32 bits at 250MHz, resolution 4ns, wraps every 8589934590ns
[    0.008796] Calibrating delay loop... 498.89 BogoMIPS (lpj=2494464)
[    0.075681] pid_max: default: 32768 minimum: 301
[    0.081154] Mount-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
[    0.089196] Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
[    0.104132] dyndbg: Ignore empty _ddebug table in a CONFIG_DYNAMIC_DEBUG_CORE build

<<< EARLIEST POSSIBLE PLATFORM DRIVER LOAD >>>
[    0.112628] EARLY_INITCALL rtl83xx clock 
[    0.122426] PURE_INITCALL rtl83xx clock
[    0.127861] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
[    0.138785] futex hash table entries: 256 (order: -1, 3072 bytes, linear)
[    0.146459] pinctrl core: initialized pinctrl subsystem
[    0.152720] CORE_INITCALL rtl83xx clock
[    0.158666] NET: Registered protocol family 16
[    0.164300] POSTCORE_INITCALL rtl83xx clock
[    0.176521] ARCH_INITCALL rtl83xx clock
[    0.338857] clocksource: Switched to clocksource MIPS

For comparison look at corresponding mainline ath79 setup routines. It works with CLK_OF_DECLARE() and runs totally outside the clk drivers.

https://elixir.bootlin.com/linux/latest/source/arch/mips/ath79/setup.c#L263 https://elixir.bootlin.com/linux/latest/source/arch/mips/ath79/clock.c

Any ideas?

plappermaul avatar Jul 31 '22 07:07 plappermaul

What you need to compare against, is a MIPS_GENERIC based platform. The MIPS_GENERIC setup code will use the DT info to fetch the clock, and to initialise mips_hpt_frequency in a generic way. See https://elixir.bootlin.com/linux/latest/source/arch/mips/generic/init.c#L151

See for example ingenic/jz4770.dtsi, which has a a phandle referring to the ingenic,jz4770-cgu clk driver as CPU clock.

svanheule avatar Jul 31 '22 08:07 svanheule

The current PR works exactly this way. So if nobody has a good idea I will stay away from the platform driver.

plappermaul avatar Jul 31 '22 08:07 plappermaul

Right, CLK_OF_DECLARE() is the right thing to do, and is not a platform driver (just like on MIPS ingenic). Perhaps I need more coffee...

Still, we should be working towards compatibility with MIPS_GENERIC in my opinion, which was mostly the point I and @robimarko (IIUC) were trying to make. So if you want to show the platform clock frequencies like the SDK does, you should rather be doing this from the clk driver itself, and keep setup.c clean of extras. I don't think it is necessary to do this at all though, because with your patches this information can be accessed from debugfs.

svanheule avatar Jul 31 '22 08:07 svanheule

I got lost in platform driver and MIPS clock initialization routines. But now I think the framework should be fine.

Find attached a partly rewirtten PR. Please comment on platform and driver initialization routines.

plappermaul avatar Jul 31 '22 15:07 plappermaul

This is some seriously cool stuff! I am looking forward to overclocked switches with water-cooling, we shall also need support for LED effects from the port leds for the modder-scene ;-)

No, seriously, this has great potential for power saving, which is becoming ever more important these days.

I would not bother too much about MIPS_GENERIC. This ship has sailed for the RTL9300 without a MIPS timer interrupt and the way the RTL8390 and RTL9300 handle the VPEs.

You write RTL83xx clock driver, is there support for the RTL839x? My understanding is that there was somewhat an evolution from the RTL838x regarding the clocks towards the RTL839x. And the RTL93xx code looks quite different again.

For development purposes it would be interesting to be able to change the LX clock. How are peripherials going to be informed that there is a clock change, will this be possible at run-time?

From my side, the code is good to be merged, it seems to work so far, and I am sure it will spark lots of ideas.

ghost avatar Aug 02 '22 06:08 ghost

Off we go. PR v3 includes an overhaul of the clock driver and adds full RTL839x support.

Feel free to test & comment.

plappermaul avatar Aug 02 '22 19:08 plappermaul

If you get rid of the soc_info dependency, this driver will run without much further modification (platform specific headers) on a 5.19 kernel. Since there's no loss of functionality with the proposed changes, I really see no point in choosing not to stay compatible with the MIPS_GENERIC based upstream platform.

Dear Sander, the upstream platform is the upstream platform. This is not what we use on OpenWRT and it will not fly anyway for the RTL9300 and RTL839x. Asking someone to remove functionality to play your political games is absolutely not OK. At this point we do not even have 5.15 compatibility and we will likely not get there unless we get the functionality of e.g. the XGS PR in that is hanging there since some time. Your role here is to review the PR, not to impose your opinions. You do due diligence and then it gets merged. Birger

ghost avatar Aug 03 '22 08:08 ghost

@bkobl You got it a bit backwards, since @svanheule is one of the maintainers, his opinion pretty much equates to the project's opinion when it comes to Realtek target. You cant just chuck in a 30+ commit PR and then expect anybody sane to rubber stamp that.

robimarko avatar Aug 03 '22 08:08 robimarko

Aside from the discussion above I updated to PR v4

plappermaul avatar Aug 04 '22 07:08 plappermaul

PR v5 is ready for review. Changes since last version:

  • RTL838x assembler setters (can change memory clock)
  • allows overclocking
  • SRAM anti-overwrite memory check
  • DDR3 300 MHz minimum check
  • fully self contained (uses own header file)

plappermaul avatar Aug 07 '22 17:08 plappermaul

PR v6 updates:

  • allow speed ranges in DT
  • fix instability due to unfinished cache flushes
  • Finer speed steps for CPU on RTL839x (25MHz)

plappermaul avatar Aug 12 '22 18:08 plappermaul

PR v7 includes:

  • switch to OPP tables
  • enable cpufreq-dt + userspace governor

To sky-rocket over the 200 comments I once again expect tons of nit-picky feedback. Nevertheless it would be helpful if someone can really test this bunch of code.

plappermaul avatar Aug 16 '22 12:08 plappermaul

To be on the safe side: is wgetting https://patch-diff.githubusercontent.com/raw/openwrt/openwrt/pull/10351.patch the right way to get the v7? I've noticed with other PRs that that approach might not always yield the most recent revision. Thanks.

Other than that: anything specific that should be tested or verified?

Borromini avatar Aug 16 '22 18:08 Borromini

Looks good. URL is most recent version.

Regarding the test: Do whatever you like and how much time you can spare.

plappermaul avatar Aug 16 '22 18:08 plappermaul

So... I tried playing with the scaling governor :smile:. I haven't connected serial yet, so I can't see what's happening. Do see it locking up though - and then it reboots.

root@OpenWrt:~# cat /sys/bus/cpu/devices/cpu0/cpufreq/scaling_governor 
userspace
root@OpenWrt:~# cat /sys/devices/system/cpu/cpufreq/policy0/scaling_available_governors 
ondemand userspace powersave performance 
root@OpenWrt:~# echo 'ondemand' > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor

Am I approaching this the wrong way or is it not supposed to work yet?

Borromini avatar Aug 17 '22 12:08 Borromini

Thanks for testing and sorry for the inconvenience. Two questions:

  • What SOC are you testing on 838x oder 839x and what model?
  • Cann you set frequency manual? echo 450000 > scaling_setspeed

Output from DGS-1210-52 (839x):

root@OpenWrt:/sys/devices/system/cpu/cpufreq/policy0# echo ondemand > scaling_governor
root@OpenWrt:/sys/devices/system/cpu/cpufreq/policy0# cat scaling_cur_freq
525000
root@OpenWrt:/sys/devices/system/cpu/cpufreq/policy0# cat scaling_cur_freq
425000
root@OpenWrt:/sys/devices/system/cpu/cpufreq/policy0# cat scaling_cur_freq
425000
root@OpenWrt:/sys/devices/system/cpu/cpufreq/policy0# cat scaling_cur_freq
575000
root@OpenWrt:/sys/devices/system/cpu/cpufreq/policy0# cat scaling_cur_freq
475000

plappermaul avatar Aug 17 '22 12:08 plappermaul

Thanks for testing and sorry for the inconvenience.

Hey, no worries. That's what testing is for right.

Two questions: * What SOC are you testing on 838x oder 839x and what model? * Cann you set frequency manual? echo 450000 > scaling_setspeed

RTL838x (ZyXEL GS1900-8HP v1).

Setting it manually seems to work:

root@OpenWrt:/sys/devices/system/cpu/cpufreq/policy0# cat scaling_governor
userspace
root@OpenWrt:/sys/devices/system/cpu/cpufreq/policy0# cat scaling_cur_freq
500000
root@OpenWrt:/sys/devices/system/cpu/cpufreq/policy0# echo 450000 > scaling_setspeed
root@OpenWrt:/sys/devices/system/cpu/cpufreq/policy0# cat scaling_cur_freq
450000

I'll give it another shot with 'ondemand' later on when I got serial connected.

Borromini avatar Aug 17 '22 13:08 Borromini

I took a look at the ondemand governor. From my observation it switches clocks frequently and and as we need busy waiting loops we consume some CPU during switching. That results in continous system CPU usage (20%) on my RTL839X. I will dig into this.

Until then you could check two things:

  • CPU performance on lowest vs. highest frequency.
  • Wattage difference between lowest and highest frequency

Thanks in advance.

plappermaul avatar Aug 17 '22 19:08 plappermaul

To follow this PR :)

Neustradamus avatar Aug 18 '22 00:08 Neustradamus

Until then you could check two things:

  • CPU performance on lowest vs. highest frequency.

Any suggestions on how to do this? Iperf between clients with the switch in between?

  • Wattage difference between lowest and highest frequency

I have a cheap energy meter I can put in between, but it's unreliable in the lower range (10W or less) AFAIK. Short of professional equipment, I suppose there's no better way to measure this?

Borromini avatar Aug 18 '22 17:08 Borromini

v8 enhancements

  • improved 838x stability
  • improved 839x stability
  • include only required asm sources into build

plappermaul avatar Aug 19 '22 19:08 plappermaul

Finally I was able to run a very simple benchmark on a 839x. Copy /dev/zero via SCP (AES256-CTR) to external PC

Frequency   Grade       Transfer    Config
425 MHz     Minimum     1.26 MB/s   echo 425000 > scaling_setspeed
700 MHz     Default     1.85 MB/s   echo 700000 > scaling_setspeed
750 MHz     Overclock   1.95 MB/s   echo 750000 > scaling_setspeed
800 MHz     Overclock   2.04 MB/s   echo 800000 > scaling_setspeed

plappermaul avatar Aug 20 '22 06:08 plappermaul

Merged the platform init patch ahead of the other changes, since that's useful to have in any case.

396e190f0be7 realtek: more generic platform initialization

svanheule avatar Aug 20 '22 09:08 svanheule

On RTL8381M (GS110TPP v1) my results are a bit strange (gzip a 10MB file of random data):

@ 500 MHz
root@OpenWrt:/sys/devices/system/cpu/cpufreq/policy0# time cat /tmp/null.bin | gzip > /dev/null
real	0m 20.90s
user	0m 0.01s
sys	0m 0.17s
root@OpenWrt:/sys/devices/system/cpu/cpufreq/policy0# cat cpuinfo_cur_freq
500000

@ 400 MHz
root@OpenWrt:/sys/devices/system/cpu/cpufreq/policy0# echo 400000 > scaling_setspeed 
root@OpenWrt:/sys/devices/system/cpu/cpufreq/policy0# time cat /tmp/null.bin | gzip > /dev/null
real	0m 18.84s
user	0m 0.00s
sys	0m 0.12s

@ 350 MHz
root@OpenWrt:/sys/devices/system/cpu/cpufreq/policy0# echo 350000 > scaling_setspeed 
root@OpenWrt:/sys/devices/system/cpu/cpufreq/policy0# time cat /tmp/null.bin | gzip > /dev/null
real	0m 17.84s
user	0m 0.00s
sys	0m 0.18s

Not sure how to interpret this. Is wall time incorrect once we reclock the CPU? Is the relative time spent on overhead going down because the CPU's cycle take longer, causing the operation to virtually speed up?

When changing the CPU frequency, I also got the following notification on the first change:

[  198.500000] R4K timer is unstable due to CPU frequency change

All SoC, from RTL838x to RTL931x, have external timers that are driven by the Lexra bus clock, so we should be able to use those if we want accurate wall times.

svanheule avatar Aug 20 '22 12:08 svanheule

PR v9 contains a slightly modified final version of the clock driver. It will provide reading of all clocks on RTL838X and RTL839X but only allows changing the CPU clock on both device familys. MEM changes would require additional DTR/MCR/DLL programming and that is beyond of spare time for this project. LXB changing has only been tested rudimentary and has currently no use case.

plappermaul avatar Aug 24 '22 19:08 plappermaul

Merged with the following commits: 1efaad03bb20 realtek: add PLL DT binding includes 4850bd887c3a realtek: add RTL83XX clock driver 800d5fb3c6a1 realtek: add patch to enable new clock driver in kernel 5df36d484968 realtek: enable basic config for cpufreq framework 7c18aab6e051 realtek: activate clock driver for RTL838X/RTL839X targets 48f3746fe5a9 realtek: switch RTL838X/RTL839X DT to new clock driver

checkpatch.pl was still complaining about 4850bd887c3a ("realtek: add RTL83XX clock driver"), so I cleaned that up for you. I've also updated the commit message for commit 5df36d484968 ("realtek: enable basic config for cpufreq framework"), to explain the caveats of allowing CPU reclocking at this moment.

svanheule avatar Aug 28 '22 09:08 svanheule