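Note: this is a Java agent!
It speeds up startup of recent Minecraft versions and fixes the problem of CPU usage staying at 100% on the main menu for a while after loading finishes.
It works on both the server and the client, and with vanilla, Spigot (Paperspigot already has a related option, Paper.WorkerThreadCount), Forge, and Fabric.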
-DminecraftThreadPoolSize=2 -DminecraftBootstrapThreadPoolSize=1 -DminecraftMainThreadPoolSize=2 -javaagent:minecraft-thread-pool-agent-1.0.0-shaded.jar
These are system properties passed as -Doption=value JVM arguments.
The default vanilla pool size is clamp(processor_count - 1, 1, 7), where processor_count is the number of hardware threads, counting both physical cores and hyperthreads.
minecraftThreadPoolSize (any version): changes the size of all pools. The parameters below override this value.
minecraftMainThreadPoolSize (1.16+): changes the main thread pool size (used for loading resources).
minecraftBootstrapThreadPoolSize (1.16+): changes the bootstrap thread pool size (used for rewriting and optimizing DataFixer types).
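The precedence between these properties can be sketched as follows. This is only an illustration: the class and method names (PoolSizeProperties, resolve) are hypothetical, and treating missing, non-numeric, or non-positive values as "not set" is an assumption, not something taken from the agent's code.

```java
final class PoolSizeProperties {
    static final String GENERAL = "minecraftThreadPoolSize";

    // Returns the size to use for one pool.
    // specificProperty is e.g. "minecraftMainThreadPoolSize";
    // vanillaSize is whatever size the game would have chosen itself.
    static int resolve(String specificProperty, int vanillaSize) {
        // Integer.getInteger returns null when the property is missing or not a number.
        Integer specific = Integer.getInteger(specificProperty);
        if (specific != null && specific > 0) {
            return specific; // the pool-specific value wins
        }
        Integer general = Integer.getInteger(GENERAL);
        if (general != null && general > 0) {
            return general; // otherwise the value for all pools applies
        }
        return vanillaSize; // nothing configured: keep the vanilla size (assumption)
    }
}
```

With -DminecraftThreadPoolSize=2 -DminecraftBootstrapThreadPoolSize=1, this sketch yields 1 for the Bootstrap pool and 2 for the Main pool.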
The information below applies to 1.16 and was obtained by analyzing decompiled client code.
At startup, the client or server creates two thread pools: Main and Bootstrap. There are other pools, but they are not important here.
The Main thread pool is used to reload resource managers. During a reload, a resource manager scans assets and resource packs, loads models and textures, creates texture atlases, and does many other related things.
The Bootstrap thread pool is used to rewrite DataFixer types. Minecraft declares a set of Schemas, each Schema containing Types. I do not know exactly how this is integrated into Minecraft, since the DataFixers library is overcomplicated and uses abstractions that are impossible to understand without a mathematical background.
Anyway, there is a type rewrite stage in which each declared type is rewritten according to an "optimization" rule. I do not know whether this means "performance optimization" or some other kind of optimization. What matters is that this process creates a lot of garbage objects and consumes a lot of CPU (the code must be compiled by the JVM, and the garbage must be collected).
The most important thing about these pools is their size. By default, each pool's size is calculated with the formula clamp(processor_count - 1, 1, 7), where processor_count is the result of calling Runtime.getRuntime().availableProcessors(). This method returns neither the physical processor count nor the physical core count, but the "logical" processor count: if you have 4 cores with hyperthreading, it returns 8. The OS can also limit the number of cores available to a process, and the result of this method will reflect that, but this is not important here.
For example, on a 4-core/8-thread machine each pool will have size clamp(8 - 1, 1, 7) = 7, meaning there will be 14 threads in total for these two executors.
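The formula is easy to check for your own machine. The snippet below is a small sketch of it; the class and method names are made up for illustration and are not Minecraft's.

```java
final class VanillaPoolSize {
    // Limits value to the range [min, max].
    static int clamp(int value, int min, int max) {
        return Math.max(min, Math.min(max, value));
    }

    // Default pool size as described above: clamp(processor_count - 1, 1, 7).
    static int defaultPoolSize(int processorCount) {
        return clamp(processorCount - 1, 1, 7);
    }

    public static void main(String[] args) {
        // Logical processor count as seen by the JVM (includes hyperthreads).
        int logical = Runtime.getRuntime().availableProcessors();
        int perPool = defaultPoolSize(logical);
        System.out.println("logical processors: " + logical);
        System.out.println("default size per pool: " + perPool);
        System.out.println("Main + Bootstrap threads: " + (2 * perPool));
    }
}
```

By this formula, 2 logical processors give a pool size of 1, 4 give 3, and 8 or more give 7.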
This is very bad for performance: type rewriting and GC are CPU-bound tasks, meaning that while they are running, they fully consume a CPU core. In contrast, IO-bound tasks do not consume that much CPU because they frequently wait for an IO resource (file, socket, etc.), and the CPU can do other work while an IO-bound task is waiting.
If you have more CPU-bound tasks than physical cores, the tasks compete and run slower. Your PC may even start lagging, because there is less CPU time for OS-related tasks and other applications.
The answer to the question "Why does Minecraft consume so much CPU on start since 1.14?" is therefore: Minecraft creates too many threads to rewrite DataFixer types and load resources, and does not take into account the physical core count, its other executors, or the time GC needs to do its work.
The solution is to reduce thread pool sizes to adequate numbers, 1-3 threads for each executor. I don't know whether type rewriting is important or not, but since Mojang decided not to wait for rewriting to complete at startup and allows the player to load a world or connect to a server immediately after resources are loaded, this might not be an issue.
Specifically, 1 thread for the Bootstrap executor and 2 for Main work especially well on my setup. Resource loading may theoretically benefit from more threads, but you can test it yourself.
After the pool sizes are reduced, startup time decreases significantly and CPU load is heavily reduced.
To make my solution as universal as possible, I use the Java instrumentation framework and the ASM library to manipulate the bytecode of loaded classes at runtime. This lets me avoid creating and distributing patches (mods).
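For readers unfamiliar with Java agents, here is a bare skeleton of the mechanism using only the standard java.lang.instrument API. It is a sketch, not this agent's source: the class names are hypothetical, and the transformer changes nothing, it only marks the place where an ASM-based rewrite would happen.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

// In a real agent jar this class would be public and named in the
// Premain-Class attribute of the jar manifest.
final class AgentSketch {
    // The JVM calls premain before the application's main method
    // when it is started with -javaagent:<path to the agent jar>.
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new PassThroughTransformer());
    }

    static final class PassThroughTransformer implements ClassFileTransformer {
        @Override
        public byte[] transform(ClassLoader loader, String className,
                                Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain,
                                byte[] classfileBuffer) {
            // classfileBuffer holds the class bytes as they are being loaded.
            // A real agent would inspect them here (for example with ASM)
            // and return modified bytes for the classes it wants to patch.
            // Returning null tells the JVM to keep the class unchanged.
            return null;
        }
    }
}
```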
There is a net.minecraft.Util class whose fields store the executors (thread pools). The executors are created by a private method that is called when this class is initialized.
I add a call to the AgentRuntime.replacePool method before these private methods return a value, which lets me replace the pools if needed. This method does trivial things like checking system properties and creating a new ForkJoinPool instance.
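A rough idea of what such a replacement hook could look like is shown below. The signature is an assumption made for this sketch (the real AgentRuntime.replacePool may take different arguments), only a single property is checked for brevity, and shutting down the original pool is likewise an assumption.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ForkJoinPool;

final class AgentRuntimeSketch {
    // Hypothetical hook: called with the pool the game just created and the
    // name of the system property that controls it.
    static ExecutorService replacePool(ExecutorService original, String propertyName) {
        // null when the property is missing or not a number
        Integer size = Integer.getInteger(propertyName);
        if (size == null || size < 1) {
            return original; // nothing configured: keep the game's pool
        }
        // The original pool has not received any tasks yet, so it can be
        // shut down to make sure it never holds on to threads.
        original.shutdown();
        return new ForkJoinPool(size); // parallelism = requested thread count
    }
}
```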
To detect the Util class, I use heuristics such as "the class must contain at least X fields and Y methods, all fields and methods must be static, and there must be logger and ExecutorService fields". This lets me avoid depending on concrete names and obfuscation maps.
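The kind of check described here can be illustrated with plain reflection. Note the assumptions: the agent itself works on bytecode with ASM before the class is loaded, while this sketch inspects an already loaded class; the thresholds are parameters because the actual X and Y are not given; the logger is recognized by the simple type name "Logger"; and FakeUtil is a made-up stand-in for the real class.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ForkJoinPool;

final class UtilClassHeuristic {
    static boolean looksLikeUtil(Class<?> c, int minFields, int minMethods) {
        Field[] fields = c.getDeclaredFields();
        Method[] methods = c.getDeclaredMethods();
        if (fields.length < minFields || methods.length < minMethods) {
            return false;
        }
        boolean hasExecutor = false;
        boolean hasLogger = false;
        for (Field f : fields) {
            if (!Modifier.isStatic(f.getModifiers())) {
                return false; // every field must be static
            }
            if (ExecutorService.class.isAssignableFrom(f.getType())) {
                hasExecutor = true;
            }
            if (f.getType().getSimpleName().equals("Logger")) {
                hasLogger = true; // name-based check, an assumption of this sketch
            }
        }
        for (Method m : methods) {
            if (!Modifier.isStatic(m.getModifiers())) {
                return false; // every method must be static
            }
        }
        return hasExecutor && hasLogger;
    }
}

// Made-up class with the same shape as the target: only static members,
// one logger field and one ExecutorService field.
final class FakeUtil {
    static final java.util.logging.Logger LOGGER =
            java.util.logging.Logger.getLogger("FakeUtil");
    static final ExecutorService POOL = ForkJoinPool.commonPool();

    static int helper() {
        return 0;
    }
}
```

Because the check relies only on the shape of the class, it keeps working when names are obfuscated, at the cost of possible false positives if the thresholds are too loose.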
To measure startup time, I apply another patch to the net.minecraft.Minecraft class. This is needed because the Minecraft client does not print its startup time to the logs (as the server does).
I add a call to AgentRuntime.startProfiling to the constructor and a call to AgentRuntime.endProfiling to any private synthetic ()V methods that contain a conditional jump opcode. There is a callback lambda that is called when the resource manager finishes reloading, and that is the actual moment when the Minecraft client becomes responsive, so placing the endProfiling call there seems reasonable.
endProfiling prints the measured time to stdout:
[STDOUT]: Done initial reload of resource manager, 11.60 sec passed since start
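The two profiling hooks can be approximated as below. This is a sketch under assumptions: the class name is hypothetical, the message text is copied from the sample output above, and the "print only once" guard is an addition motivated by the fact that endProfiling is patched into every matching method and could therefore be reached more than once.

```java
import java.util.Locale;
import java.util.concurrent.atomic.AtomicBoolean;

final class ProfilingSketch {
    private static volatile long startNanos;
    private static final AtomicBoolean reported = new AtomicBoolean(false);

    // Called from the patched constructor.
    static void startProfiling() {
        startNanos = System.nanoTime();
    }

    // Formats the elapsed time with two decimals, independent of the system locale.
    static String formatElapsed(long elapsedNanos) {
        double seconds = elapsedNanos / 1_000_000_000.0;
        return String.format(Locale.ROOT,
                "Done initial reload of resource manager, %.2f sec passed since start",
                seconds);
    }

    // Called from the patched callback; only the first call prints.
    static void endProfiling() {
        if (reported.compareAndSet(false, true)) {
            System.out.println(formatElapsed(System.nanoTime() - startNanos));
        }
    }
}
```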