Show chain of references in Ractor errors #935
base: master
d37bed4 to 89a62f7
vm.c
Outdated
    !RB_OBJ_SHAREABLE_P(block_self)) {
    if (!rb_ractor_shareable_p_continue(block_self, chain)) {
    if (chain) {
    if (NIL_P(*chain)) {
Why are we duplicating the chain_append logic here?
I don't know what's the best way to share code here. Should I make the function non-static, prefix it somehow and add it to "ractor_core.h"?
I moved it to an inline function in ractor_core.h.
bootstraptest/test_ractor.rb
Outdated
    " from block self #<Foo @ivar={}>\n" \
    " from hash default value\n" \
    " from instance variable @ivar\n" \
    " from instance variable @foo", %q{
The chain approach makes sense to me, but I'm finding the error message a little hard to parse: it's not immediately obvious from the message what objects the ivars are attached to, or where the hash values are coming from. Is it worth using rb_inspect or rb_obj_as_string here, or is performance an issue?
It's not a performance issue, it's more that the output gets really messy really fast.
Even in the simple case of just instance variables, it will cause a huge wall of text with a lot of repetition:
class A; attr_accessor :b; end
class B; attr_accessor :c; end
class C; attr_accessor :d; end
class D; attr_accessor :e; end
a = A.new
a.b = b = B.new
b.c = c = C.new
c.d = d = D.new
d.e = ->{}
Ractor.make_shareable a
../../test.rb:12:in 'Ractor.make_shareable': Proc's self is not shareable: #<Proc:0x0000000103731150 ../../test.rb:10 (lambda)> (Ractor::IsolationError)
from block self main
from instance variable @e of #<D:0x0000000103731210 @e=#<Proc:0x0000000103731150 ../../test.rb:10 (lambda)>>
from instance variable @d of #<C:0x0000000103731300 @d=#<D:0x0000000103731210 @e=#<Proc:0x0000000103731150 ../../test.rb:10 (lambda)>>>
from instance variable @c of #<B:0x00000001037313f0 @c=#<C:0x0000000103731300 @d=#<D:0x0000000103731210 @e=#<Proc:0x0000000103731150 ../../test.rb:10 (lambda)>>>>
from instance variable @b of #<A:0x00000001037314b0 @b=#<B:0x00000001037313f0 @c=#<C:0x0000000103731300 @d=#<D:0x0000000103731210 @e=#<Proc:0x0000000103731150 ../../test.rb:10 (lambda)>>>>>
from ../../test.rb:12:in '<main>'
My first approach always had the object under consideration in addition to the "reference" and it wasn't super readable.
Using this branch I've noticed I usually just need the last line anyway.
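To make the trade-off discussed in this thread concrete, here is a minimal pure-Ruby sketch of the chain idea: walk the object graph depth-first and, once an offending object is found, append one "from ..." line per reference while unwinding. The helper names `references_of` and `reference_chain`, the predicate block, and the "array element" and "hash value" labels are assumptions made for illustration; the PR itself does this in C as part of the shareability check, and its exact labels may differ.

```ruby
# Hypothetical helper: list the outgoing references of an object as
# [label, child] pairs. Only a few reference kinds are modelled here.
def references_of(obj)
  refs = obj.instance_variables.map do |ivar|
    ["instance variable #{ivar}", obj.instance_variable_get(ivar)]
  end
  case obj
  when Array
    obj.each { |elem| refs << ["array element", elem] }
  when Hash
    refs << ["hash default value", obj.default]
    obj.each_value { |value| refs << ["hash value", value] }
  end
  refs
end

# Hypothetical helper: returns the chain of references leading from `obj`
# to the first object for which the block returns true, innermost first,
# or nil when no such object is reachable.
def reference_chain(obj, seen = {}.compare_by_identity, &offending)
  return [] if offending.call(obj)
  return nil if seen.key?(obj) # already visited, avoids infinite loops on cycles
  seen[obj] = true
  references_of(obj).each do |label, child|
    chain = reference_chain(child, seen, &offending)
    # Append while unwinding, so the innermost reference comes first.
    return chain + ["from #{label}"] if chain
  end
  nil
end

class A; attr_accessor :b; end
class B; attr_accessor :c; end
class C; attr_accessor :d; end
class D; attr_accessor :e; end

a = A.new
a.b = b = B.new
b.c = c = C.new
c.d = d = D.new
d.e = -> {}

# Treat any Proc as the offending object, as a stand-in for "not shareable".
puts reference_chain(a) { |o| o.is_a?(Proc) }
# from instance variable @e
# from instance variable @d
# from instance variable @c
# from instance variable @b
```

Each line names only the reference that was followed, not the object holding it, which keeps the output at one short line per hop instead of a nested `inspect` dump.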
89a62f7 to 2e55ee7
[Feature #21846]

There is a single path through our GC sweeping code, and we always call rb_gc_obj_free_vm_weak_references and rb_gc_obj_free before adding the object back to the freelist. We do this even when the object has no external resources that need to be freed and has no weak references pointing to it.

This commit introduces a conservative fast path through gc_sweep_plane that uses the object flags to identify certain cases where these calls can be skipped - for these objects we just add them straight back on the freelist. Any object for which gc_sweep_fast_path_p returns false will use the current full sweep code (referred to here as the slow path).

Currently there are 2 checks that will _always_ require an object to go down the slow path:

1. Has its object_id been observed and stored in the id2ref_table
2. Has it got generic ivars in the gen_fields table

If neither of these is true, then we run some flag checks on the object and send the following cases down the fast path:

- Objects that are not heap allocated
- Embedded strings that aren't in the fstring table
- Embedded Arrays
- Embedded Hashes
- Embedded Bignums
- Embedded Strings
- Floats, Rationals and Complex
- Various IMEMO subtypes that do no allocation

We've benchmarked this code using ruby-bench as well as the gcbench benchmarks inside Ruby (benchmarks/gc) and this patch results in a modest speed improvement on almost all of the headline benchmarks (2% in railsbench with YJIT enabled), and an observable 30% improvement in time spent sweeping during the GC benchmarks:

```
master: ruby 4.1.0dev (2026-01-19T12:03:33Z master 859920d) +YJIT +PRISM [x86_64-linux]
experiment: ruby 4.1.0dev (2026-01-16T21:36:46Z mvh-sweep-fast-pat.. c3ffe37) +YJIT +PRISM [x86_64-linux]

--------------  -----------  ----------  ---------------  ----------  ------------------  -----------------
bench           master (ms)  stddev (%)  experiment (ms)  stddev (%)  experiment 1st itr  master/experiment
lobsters        N/A          N/A         N/A              N/A         N/A                 N/A
activerecord    132.5        0.9         132.5            1.0         1.056               1.001
chunky-png      577.2        0.4         580.1            0.4         0.994               0.995
erubi-rails     902.9        0.2         894.3            0.2         1.040               1.010
hexapdf         1763.9       3.3         1760.6           3.7         1.027               1.002
liquid-c        56.9         0.6         56.7             1.4         1.004               1.003
liquid-compile  46.3         2.1         46.1             2.1         1.005               1.004
liquid-render   77.8         0.8         75.1             0.9         1.023               1.036
mail            114.7        0.4         113.0            1.4         1.054               1.015
psych-load      1635.4       1.4         1625.9           0.5         0.988               1.006
railsbench      1685.4       2.4         1650.1           2.0         0.989               1.021
rubocop         133.5        8.1         130.3            7.8         1.002               1.024
ruby-lsp        140.3        1.9         137.5            1.8         1.007               1.020
sequel          64.6         0.7         63.9             0.7         1.003               1.011
shipit          1196.2       4.3         1181.5           4.2         1.003               1.012
--------------  -----------  ----------  ---------------  ----------  ------------------  -----------------
Legend:
- experiment 1st itr: ratio of master/experiment time for the first benchmarking iteration.
- master/experiment: ratio of master/experiment time. Higher is better for experiment. Above 1 represents a speedup.
```

```
Benchmark      │ Wall(B)   Sweep(B)  Mark(B)  │ Wall(E)   Sweep(E)  Mark(E)  │ Wall Δ   Sweep Δ
───────────────┼──────────────────────────────┼──────────────────────────────┼─────────────────
null           │ 0.000s    1ms       4ms      │ 0.000s    1ms       4ms      │ 0%       0%
hash1          │ 4.330s    875ms     46ms     │ 3.960s    531ms     44ms     │ +8.6%    +39.3%
hash2          │ 6.356s    243ms     988ms    │ 6.298s    176ms     1.03s    │ +0.9%    +27.6%
rdoc           │ 37.337s   2.42s     1.09s    │ 36.678s   2.11s     1.20s    │ +1.8%    +13.1%
binary_trees   │ 3.366s    426ms     252ms    │ 3.082s    275ms     239ms    │ +8.4%    +35.4%
ring           │ 5.252s    14ms      2.47s    │ 5.327s    12ms      2.43s    │ -1.4%    +14.3%
redblack       │ 2.966s    28ms      41ms     │ 2.940s    21ms      38ms     │ +0.9%    +25.0%
───────────────┼──────────────────────────────┼──────────────────────────────┼─────────────────
Legend: (B) = Baseline, (E) = Experiment, Δ = improvement (positive = faster)
Wall = total wallclock, Sweep = GC sweeping time, Mark = GC marking time
Times are median of 3 runs
```

These results are also borne out when YJIT is disabled:

```
master: ruby 4.1.0dev (2026-01-19T12:03:33Z master 859920d) +PRISM [x86_64-linux]
experiment: ruby 4.1.0dev (2026-01-16T21:36:46Z mvh-sweep-fast-pat.. c3ffe37) +PRISM [x86_64-linux]

--------------  -----------  ----------  ---------------  ----------  ------------------  -----------------
bench           master (ms)  stddev (%)  experiment (ms)  stddev (%)  experiment 1st itr  master/experiment
lobsters        N/A          N/A         N/A              N/A         N/A                 N/A
activerecord    389.6        0.3         377.5            0.3         1.032               1.032
chunky-png      1123.4       0.2         1109.2           0.2         1.013               1.013
erubi-rails     1754.3       0.1         1725.7           0.1         1.035               1.017
hexapdf         3346.5       0.9         3326.9           0.7         1.003               1.006
liquid-c        84.0         0.5         83.5             0.5         0.992               1.006
liquid-compile  74.0         1.5         73.5             1.4         1.011               1.008
liquid-render   199.9        0.4         199.6            0.4         1.000               1.002
mail            177.8        0.4         176.4            0.4         1.069               1.008
psych-load      2749.6       0.7         2777.0           0.0         0.980               0.990
railsbench      2983.0       1.0         2965.5           0.8         1.041               1.006
rubocop         228.8        1.0         227.5            1.2         1.015               1.005
ruby-lsp        221.8        0.9         216.1            0.8         1.011               1.026
sequel          89.1         0.5         89.1             1.8         1.005               1.000
shipit          2385.6       1.6         2371.8           1.0         1.002               1.006
--------------  -----------  ----------  ---------------  ----------  ------------------  -----------------
Legend:
- experiment 1st itr: ratio of master/experiment time for the first benchmarking iteration.
- master/experiment: ratio of master/experiment time. Higher is better for experiment. Above 1 represents a speedup.
```

```
Benchmark      │ Wall(B)   Sweep(B)  Mark(B)  │ Wall(E)   Sweep(E)  Mark(E)  │ Wall Δ   Sweep Δ
───────────────┼──────────────────────────────┼──────────────────────────────┼─────────────────
null           │ 0.000s    1ms       4ms      │ 0.000s    1ms       3ms      │ 0%       0%
hash1          │ 4.349s    877ms     45ms     │ 4.045s    532ms     44ms     │ +7.0%    +39.3%
hash2          │ 6.575s    235ms     967ms    │ 6.540s    181ms     1.04s    │ +0.5%    +23.0%
rdoc           │ 45.782s   2.23s     1.14s    │ 44.925s   1.90s     1.01s    │ +1.9%    +15.0%
binary_trees   │ 6.433s    426ms     252ms    │ 6.268s    278ms     240ms    │ +2.6%    +34.7%
ring           │ 6.584s    17ms      2.33s    │ 6.738s    13ms      2.33s    │ -2.3%    +30.8%
redblack       │ 13.334s   31ms      42ms     │ 13.296s   24ms      107ms    │ +0.3%    +22.6%
───────────────┼──────────────────────────────┼──────────────────────────────┼─────────────────
Legend: (B) = Baseline, (E) = Experiment, Δ = improvement (positive = faster)
Wall = total wallclock, Sweep = GC sweeping time, Mark = GC marking time
Times are median of 3 runs
```
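The fast-path decision described in this commit message can be modelled roughly as follows. This is a toy Ruby model, not the C implementation: the `Slot` struct, its field names, and the `sweep_fast_path?` and `sweep` helpers are all hypothetical, and the non-heap-allocated and IMEMO cases from the list are left out for brevity.

```ruby
# Toy model of a heap slot. Field names are illustrative only and do not
# correspond to Ruby's internal C structs or flags.
Slot = Struct.new(:type, :embedded, :id_observed, :generic_ivars, :fstring,
                  keyword_init: true)

# Mirrors the described checks: two conditions always force the slow path,
# then a per-type check decides whether cleanup can be skipped.
def sweep_fast_path?(slot)
  return false if slot.id_observed   # object_id was observed and stored
  return false if slot.generic_ivars # has generic ivars stored externally

  case slot.type
  when :float, :rational, :complex then true
  when :array, :hash, :bignum      then !!slot.embedded
  when :string                     then !!slot.embedded && !slot.fstring
  else false
  end
end

# Every dead slot ends up on the freelist; only slow-path slots pay for the
# full cleanup, which is represented here by a simple counter.
def sweep(dead_slots)
  freelist = []
  slow_path_cleanups = 0
  dead_slots.each do |slot|
    slow_path_cleanups += 1 unless sweep_fast_path?(slot)
    freelist << slot
  end
  [freelist.size, slow_path_cleanups]
end

dead = [
  Slot.new(type: :array, embedded: true),
  Slot.new(type: :array, embedded: true, id_observed: true),
  Slot.new(type: :string, embedded: true, fstring: true),
  Slot.new(type: :float),
  Slot.new(type: :object)
]
freed, slow = sweep(dead)
puts "freed=#{freed} slow_path=#{slow}"
```

The point of the conservative design is that a wrong `false` only costs time, while a wrong `true` would skip required cleanup, so anything not explicitly recognised falls through to the slow path.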
It relies too much on VM level concerns, such that it can't be built with modular GC enabled. We'll move it into the VM, and then expose it to the GC implementations so they can use it.
Most compilers will optimise this anyway
Continually locking a mutex `m` can lead to starvation if all other threads are on the waitq of `m`. See https://bugs.ruby-lang.org/issues/21840 for more details.

Solution: When a thread `T1` wakes up `T2` during mutex unlock but `T1` or any other thread successfully acquires the mutex before `T2`, we record the `running_time` of the acquiring thread during mutex acquisition. Then during unlock, if that thread's `running_time` is less than the saved running time, we set it back to the saved time.

Fixes [Bug #21840]
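The bookkeeping in this fix can be pictured with a small simulation. This is only a sketch of the save-and-restore rule as worded in the commit message: `SimThread`, `SimMutex`, and the `jumped_queue:` flag are hypothetical names, the real logic lives in the VM's C thread code, and the commit message does not say why `running_time` may have dropped by unlock time, so the reset below is simply assumed.

```ruby
# Hypothetical stand-in for a VM thread with an accumulated running time.
SimThread = Struct.new(:name, :running_time)

class SimMutex
  def initialize
    @owner = nil
    @saved_running_time = nil
  end

  # `jumped_queue` is true when a waiter had already been woken for this
  # mutex but this thread acquired it first.
  def lock(thread, jumped_queue: false)
    raise "already locked" if @owner
    @owner = thread
    @saved_running_time = jumped_queue ? thread.running_time : nil
  end

  # On unlock, never let the owner's running time fall below what was
  # recorded at acquisition.
  def unlock(thread)
    raise "not the owner" unless @owner.equal?(thread)
    if @saved_running_time && thread.running_time < @saved_running_time
      thread.running_time = @saved_running_time
    end
    @owner = nil
    @saved_running_time = nil
  end
end

t1 = SimThread.new("T1", 80)
m = SimMutex.new
m.lock(t1, jumped_queue: true)
t1.running_time = 0 # assumed: something reset the counter while locked
m.unlock(t1)
puts t1.running_time
```

Under that assumption, restoring the saved value means a thread that keeps re-acquiring the mutex still accumulates running time and will eventually be treated as having used up its share.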
We would like to do type matching on the VRegId. Extracting the VRegId from a usize makes the code a bit easier to understand and refactor. MemBase uses a VReg, and there is also a VReg in Opnd. We should be sharing types between these two, so this is a step in the direction of sharing a type.
Until we get our global register allocator, we need our HIR to be in 100% block-local SSA. Add a validator to enforce that.
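As an illustration of what such a check could look like, the sketch below validates a toy IR for one plausible reading of "block-local SSA": every value is defined exactly once, and every operand refers to a block parameter or to an earlier instruction in the same block. The `Insn` and `Block` structs and the `block_local_ssa_errors` helper are hypothetical and are not taken from the ZJIT source, which is written in Rust and may enforce different or additional rules.

```ruby
# Toy IR. `out` is the value id an instruction defines (or nil), and
# `operands` are the value ids it uses.
Insn  = Struct.new(:out, :operands)
Block = Struct.new(:name, :params, :insns)

# Returns a list of human-readable violations; empty means the IR passes.
def block_local_ssa_errors(blocks)
  errors = []
  defined_in = {} # value id => name of the block that defined it

  blocks.each do |block|
    local = {} # value ids visible so far within this block

    define = lambda do |id|
      if defined_in.key?(id)
        errors << "#{id} defined more than once (#{defined_in[id]} and #{block.name})"
      end
      defined_in[id] = block.name
      local[id] = true
    end

    block.params.each(&define)
    block.insns.each do |insn|
      insn.operands.each do |id|
        next if local[id]
        errors << "#{id} used in #{block.name} without a prior definition in that block"
      end
      define.call(insn.out) if insn.out
    end
  end

  errors
end

ok = [
  Block.new(:bb0, [:v0], [Insn.new(:v1, [:v0]), Insn.new(nil, [:v1])]),
  Block.new(:bb1, [:v2], [Insn.new(:v3, [:v2])])
]
leaky = [
  Block.new(:bb0, [:v0], [Insn.new(:v1, [:v0])]),
  Block.new(:bb1, [],    [Insn.new(:v2, [:v1])]) # :v1 comes from bb0
]

puts block_local_ssa_errors(ok).inspect
puts block_local_ssa_errors(leaky).inspect
```

In this model a value that must cross a block boundary has to be passed as a block parameter, which is what lets later stages reason about one block at a time.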
The RDoc link format has changed so these are all broken links.
The RDoc link format has changed so these are all broken links.

ruby/net-http@97fe6085c3
- T_BIGNUM may have fields via `#object_id`.
- The T_DATA logic was inverted. If `dfree` is unset we don't need cleanup.
Since `on_sp` is emitted, it doesn't do a whole lot anymore. This leaves one incompatibility for code like `"x#$%"`. Ripper confuses this for bare interpolation with a global, but `$%` is not a valid global name. Still, it emits two string tokens in such a case. It doesn't make sense for prism to work around this bug, so the affected files are added as excludes.

Since the only usage of this method makes sense for testing in prism itself, the method is removed instead of deprecated.

ruby/prism@31be379f98
f44c160 to b4d2a43
…uby#15982) Don't reset `th->running_time_us` when unlocking from `mutex_free` or force unlocking during thread destruction. Follow-up to 994257a.
Closes: Shopify#862 Add dynamic dispatch for `invokesuperforward` instruction as a first step. Specialization like YJIT’s is not implemented yet and will be handled separately. ## Benchmark ### lobsters <details> <summary>before patch</summary> ``` Average of last 10, non-warmup iters: 654ms ***ZJIT: Printing ZJIT statistics on exit*** Top-20 not inlined C methods (59.5% of total 15,599,811): Hash#fetch: 3,185,110 (20.4%) Regexp#match?: 708,802 ( 4.5%) Hash#key?: 696,422 ( 4.5%) String#sub!: 489,840 ( 3.1%) Set#include?: 396,625 ( 2.5%) String#<<: 396,279 ( 2.5%) String#start_with?: 379,336 ( 2.4%) Hash#delete: 325,992 ( 2.1%) String.new: 307,248 ( 2.0%) Integer#===: 279,054 ( 1.8%) Symbol#end_with?: 255,539 ( 1.6%) Kernel#is_a?: 246,961 ( 1.6%) Process.clock_gettime: 221,588 ( 1.4%) Integer#>: 219,718 ( 1.4%) String#match?: 218,056 ( 1.4%) Integer#<=: 202,617 ( 1.3%) Time#to_i: 192,214 ( 1.2%) Time#subsec: 189,240 ( 1.2%) String#to_sym: 185,593 ( 1.2%) String#include?: 182,862 ( 1.2%) Top-20 calls to C functions from JIT code (83.7% of total 126,406,213): rb_vm_opt_send_without_block: 37,054,888 (29.3%) rb_vm_send: 10,068,319 ( 8.0%) rb_vm_env_write: 8,529,584 ( 6.7%) rb_hash_aref: 8,014,188 ( 6.3%) rb_zjit_writebarrier_check_immediate: 7,697,828 ( 6.1%) rb_vm_getinstancevariable: 5,954,987 ( 4.7%) rb_ivar_get_at_no_ractor_check: 4,759,191 ( 3.8%) rb_obj_is_kind_of: 3,722,656 ( 2.9%) rb_vm_invokesuper: 2,663,433 ( 2.1%) rb_hash_aset: 2,416,121 ( 1.9%) rb_vm_setinstancevariable: 2,355,463 ( 1.9%) rb_vm_opt_getconstant_path: 2,297,784 ( 1.8%) Hash#fetch: 1,779,524 ( 1.4%) fetch: 1,405,586 ( 1.1%) rb_vm_invokeblock: 1,385,970 ( 1.1%) rb_str_buf_append: 1,369,178 ( 1.1%) rb_ec_ary_new_from_values: 1,336,805 ( 1.1%) rb_class_allocate_instance: 1,281,590 ( 1.0%) rb_hash_new_with_size: 899,859 ( 0.7%) rb_vm_sendforward: 798,572 ( 0.6%) Top-2 not optimized method types for send (100.0% of total 4,889,764): iseq: 4,886,942 (99.9%) null: 2,822 ( 0.1%) Top-3 not optimized 
method types for send_without_block (100.0% of total 525,349): optimized_send: 478,875 (91.2%) null: 42,175 ( 8.0%) optimized_block_call: 4,299 ( 0.8%) Top-3 not optimized method types for super (100.0% of total 2,350,295): cfunc: 2,239,567 (95.3%) alias: 107,374 ( 4.6%) attrset: 3,354 ( 0.1%) Top-3 instructions with uncategorized fallback reason (100.0% of total 2,216,938): invokeblock: 1,385,970 (62.5%) sendforward: 798,572 (36.0%) opt_send_without_block: 32,396 ( 1.5%) Top-20 send fallback reasons (99.9% of total 51,971,182): send_without_block_polymorphic: 18,639,354 (35.9%) singleton_class_seen: 9,274,307 (17.8%) send_without_block_no_profiles: 7,217,551 (13.9%) send_not_optimized_method_type: 4,889,764 ( 9.4%) send_no_profiles: 2,882,604 ( 5.5%) super_not_optimized_method_type: 2,350,295 ( 4.5%) uncategorized: 2,216,938 ( 4.3%) one_or_more_complex_arg_pass: 1,543,405 ( 3.0%) send_without_block_megamorphic: 723,037 ( 1.4%) send_polymorphic: 544,570 ( 1.0%) send_without_block_not_optimized_method_type_optimized: 483,174 ( 0.9%) send_without_block_not_optimized_need_permission: 390,366 ( 0.8%) too_many_args_for_lir: 312,568 ( 0.6%) super_complex_args_pass: 111,053 ( 0.2%) super_target_complex_args_pass: 104,723 ( 0.2%) super_polymorphic: 87,851 ( 0.2%) argc_param_mismatch: 50,382 ( 0.1%) send_without_block_not_optimized_method_type: 42,175 ( 0.1%) obj_to_string_not_string: 34,861 ( 0.1%) send_without_block_direct_keyword_mismatch: 32,436 ( 0.1%) Top-4 setivar fallback reasons (100.0% of total 2,355,463): not_monomorphic: 2,132,748 (90.5%) not_t_object: 125,163 ( 5.3%) too_complex: 97,531 ( 4.1%) new_shape_needs_extension: 21 ( 0.0%) Top-2 getivar fallback reasons (100.0% of total 6,080,097): not_monomorphic: 5,808,527 (95.5%) too_complex: 271,570 ( 4.5%) Top-3 definedivar fallback reasons (100.0% of total 405,302): not_monomorphic: 397,150 (98.0%) too_complex: 5,122 ( 1.3%) not_t_object: 3,030 ( 0.7%) Top-6 invokeblock handler (100.0% of total 1,385,970): 
monomorphic_iseq: 688,147 (49.7%) polymorphic: 523,864 (37.8%) monomorphic_other: 106,268 ( 7.7%) monomorphic_ifunc: 55,505 ( 4.0%) megamorphic: 6,762 ( 0.5%) no_profiles: 5,424 ( 0.4%) Top-8 popular complex argument-parameter features not optimized (100.0% of total 1,850,659): param_forwardable: 685,936 (37.1%) param_block: 641,355 (34.7%) param_rest: 327,046 (17.7%) param_kwrest: 120,210 ( 6.5%) caller_kw_splat: 36,147 ( 2.0%) caller_splat: 34,029 ( 1.8%) caller_blockarg: 5,826 ( 0.3%) caller_kwarg: 110 ( 0.0%) Top-1 compile error reasons (100.0% of total 191,769): exception_handler: 191,769 (100.0%) Top-6 unhandled YARV insns (100.0% of total 89,278): invokesuperforward: 81,667 (91.5%) getconstant: 3,318 ( 3.7%) setblockparam: 2,837 ( 3.2%) checkmatch: 929 ( 1.0%) expandarray: 360 ( 0.4%) once: 167 ( 0.2%) Top-3 unhandled HIR insns (100.0% of total 236,976): throw: 198,481 (83.8%) invokebuiltin: 35,774 (15.1%) array_max: 2,721 ( 1.1%) Top-20 side exit reasons (100.0% of total 15,409,202): guard_type_failure: 6,871,609 (44.6%) guard_shape_failure: 6,854,409 (44.5%) block_param_proxy_not_iseq_or_ifunc: 1,008,346 ( 6.5%) unhandled_hir_insn: 236,976 ( 1.5%) compile_error: 191,769 ( 1.2%) unhandled_yarv_insn: 89,278 ( 0.6%) fixnum_mult_overflow: 50,739 ( 0.3%) block_param_proxy_modified: 28,119 ( 0.2%) patchpoint_stable_constant_names: 19,872 ( 0.1%) unhandled_newarray_send_pack: 14,481 ( 0.1%) unhandled_block_arg: 13,787 ( 0.1%) fixnum_lshift_overflow: 10,085 ( 0.1%) patchpoint_no_ep_escape: 7,815 ( 0.1%) expandarray_failure: 4,532 ( 0.0%) guard_super_method_entry: 4,475 ( 0.0%) patchpoint_method_redefined: 1,212 ( 0.0%) patchpoint_no_singleton_class: 1,130 ( 0.0%) obj_to_string_fallback: 275 ( 0.0%) guard_less_failure: 163 ( 0.0%) interrupt: 111 ( 0.0%) send_count: 152,221,918 dynamic_send_count: 51,971,182 (34.1%) optimized_send_count: 100,250,736 (65.9%) dynamic_setivar_count: 2,355,463 ( 1.5%) dynamic_getivar_count: 6,080,097 ( 4.0%) dynamic_definedivar_count: 
405,302 ( 0.3%) iseq_optimized_send_count: 40,162,692 (26.4%) inline_cfunc_optimized_send_count: 40,296,415 (26.5%) inline_iseq_optimized_send_count: 3,344,046 ( 2.2%) non_variadic_cfunc_optimized_send_count: 8,915,909 ( 5.9%) variadic_cfunc_optimized_send_count: 7,531,674 ( 4.9%) compiled_iseq_count: 5,554 failed_iseq_count: 0 compile_time: 1,779ms profile_time: 13ms gc_time: 19ms invalidation_time: 248ms vm_write_pc_count: 133,179,978 vm_write_sp_count: 133,179,978 vm_write_locals_count: 129,160,863 vm_write_stack_count: 129,160,863 vm_write_to_parent_iseq_local_count: 693,262 vm_read_from_parent_iseq_local_count: 14,736,626 guard_type_count: 157,425,618 guard_type_exit_ratio: 4.4% guard_shape_count: 64,005,824 guard_shape_exit_ratio: 10.7% code_region_bytes: 29,147,136 zjit_alloc_bytes: 44,468,338 total_mem_bytes: 73,615,474 side_exit_count: 15,409,202 total_insn_count: 934,468,730 vm_insn_count: 166,726,703 zjit_insn_count: 767,742,027 ratio_in_zjit: 82.2% ``` </details> <details> <summary>after patch</summary> ``` Average of last 10, non-warmup iters: 648ms ***ZJIT: Printing ZJIT statistics on exit*** Top-20 not inlined C methods (59.5% of total 15,571,939): Hash#fetch: 3,185,114 (20.5%) Regexp#match?: 708,795 ( 4.6%) Hash#key?: 696,422 ( 4.5%) String#sub!: 489,841 ( 3.1%) Set#include?: 396,625 ( 2.5%) String#<<: 396,279 ( 2.5%) String#start_with?: 370,465 ( 2.4%) Hash#delete: 325,992 ( 2.1%) String.new: 307,248 ( 2.0%) Integer#===: 277,929 ( 1.8%) Symbol#end_with?: 255,540 ( 1.6%) Kernel#is_a?: 246,961 ( 1.6%) Process.clock_gettime: 221,588 ( 1.4%) Integer#>: 219,718 ( 1.4%) String#match?: 218,057 ( 1.4%) Integer#<=: 202,617 ( 1.3%) Time#to_i: 192,214 ( 1.2%) Time#subsec: 189,240 ( 1.2%) String#to_sym: 185,593 ( 1.2%) String#include?: 182,863 ( 1.2%) Top-20 calls to C functions from JIT code (83.7% of total 126,248,940): rb_vm_opt_send_without_block: 36,875,422 (29.2%) rb_vm_send: 10,068,311 ( 8.0%) rb_vm_env_write: 8,529,572 ( 6.8%) rb_hash_aref: 8,014,184 ( 
6.3%) rb_zjit_writebarrier_check_immediate: 7,697,776 ( 6.1%) rb_vm_getinstancevariable: 5,934,206 ( 4.7%) rb_ivar_get_at_no_ractor_check: 4,759,185 ( 3.8%) rb_obj_is_kind_of: 3,745,913 ( 3.0%) rb_vm_invokesuper: 2,663,429 ( 2.1%) rb_hash_aset: 2,416,112 ( 1.9%) rb_vm_setinstancevariable: 2,361,107 ( 1.9%) rb_vm_opt_getconstant_path: 2,294,768 ( 1.8%) Hash#fetch: 1,779,524 ( 1.4%) fetch: 1,405,590 ( 1.1%) rb_vm_invokeblock: 1,385,975 ( 1.1%) rb_str_buf_append: 1,369,179 ( 1.1%) rb_ec_ary_new_from_values: 1,336,806 ( 1.1%) rb_class_allocate_instance: 1,281,533 ( 1.0%) rb_hash_new_with_size: 899,857 ( 0.7%) rb_vm_sendforward: 798,572 ( 0.6%) Top-2 not optimized method types for send (100.0% of total 4,889,758): iseq: 4,886,936 (99.9%) null: 2,822 ( 0.1%) Top-3 not optimized method types for send_without_block (100.0% of total 525,350): optimized_send: 478,875 (91.2%) null: 42,176 ( 8.0%) optimized_block_call: 4,299 ( 0.8%) Top-3 not optimized method types for super (100.0% of total 2,350,289): cfunc: 2,239,565 (95.3%) alias: 107,374 ( 4.6%) attrset: 3,350 ( 0.1%) Top-4 instructions with uncategorized fallback reason (100.0% of total 2,298,609): invokeblock: 1,385,975 (60.3%) sendforward: 798,572 (34.7%) invokesuperforward: 81,666 ( 3.6%) opt_send_without_block: 32,396 ( 1.4%) Top-20 send fallback reasons (99.9% of total 51,873,375): send_without_block_polymorphic: 18,540,291 (35.7%) singleton_class_seen: 9,210,394 (17.8%) send_without_block_no_profiles: 7,202,051 (13.9%) send_not_optimized_method_type: 4,889,758 ( 9.4%) send_no_profiles: 2,882,602 ( 5.6%) super_not_optimized_method_type: 2,350,289 ( 4.5%) uncategorized: 2,298,609 ( 4.4%) one_or_more_complex_arg_pass: 1,543,404 ( 3.0%) send_without_block_megamorphic: 723,037 ( 1.4%) send_polymorphic: 544,570 ( 1.0%) send_without_block_not_optimized_method_type_optimized: 483,174 ( 0.9%) send_without_block_not_optimized_need_permission: 389,384 ( 0.8%) too_many_args_for_lir: 312,568 ( 0.6%) super_complex_args_pass: 
111,054 ( 0.2%) super_target_complex_args_pass: 104,723 ( 0.2%) super_polymorphic: 87,852 ( 0.2%) argc_param_mismatch: 50,382 ( 0.1%) send_without_block_not_optimized_method_type: 42,176 ( 0.1%) obj_to_string_not_string: 34,853 ( 0.1%) send_without_block_direct_keyword_mismatch: 32,436 ( 0.1%) Top-4 setivar fallback reasons (100.0% of total 2,361,107): not_monomorphic: 2,138,392 (90.6%) not_t_object: 125,163 ( 5.3%) too_complex: 97,531 ( 4.1%) new_shape_needs_extension: 21 ( 0.0%) Top-2 getivar fallback reasons (100.0% of total 6,059,319): not_monomorphic: 5,787,746 (95.5%) too_complex: 271,573 ( 4.5%) Top-3 definedivar fallback reasons (100.0% of total 405,302): not_monomorphic: 397,150 (98.0%) too_complex: 5,122 ( 1.3%) not_t_object: 3,030 ( 0.7%) Top-6 invokeblock handler (100.0% of total 1,385,975): monomorphic_iseq: 688,157 (49.7%) polymorphic: 523,861 (37.8%) monomorphic_other: 106,268 ( 7.7%) monomorphic_ifunc: 55,505 ( 4.0%) megamorphic: 6,760 ( 0.5%) no_profiles: 5,424 ( 0.4%) Top-8 popular complex argument-parameter features not optimized (100.0% of total 1,850,658): param_forwardable: 685,941 (37.1%) param_block: 641,355 (34.7%) param_rest: 327,046 (17.7%) param_kwrest: 120,209 ( 6.5%) caller_kw_splat: 36,147 ( 2.0%) caller_splat: 34,029 ( 1.8%) caller_blockarg: 5,821 ( 0.3%) caller_kwarg: 110 ( 0.0%) Top-1 compile error reasons (100.0% of total 191,769): exception_handler: 191,769 (100.0%) Top-5 unhandled YARV insns (100.0% of total 7,611): getconstant: 3,318 (43.6%) setblockparam: 2,837 (37.3%) checkmatch: 929 (12.2%) expandarray: 360 ( 4.7%) once: 167 ( 2.2%) Top-3 unhandled HIR insns (100.0% of total 236,976): throw: 198,481 (83.8%) invokebuiltin: 35,774 (15.1%) array_max: 2,721 ( 1.1%) Top-20 side exit reasons (100.0% of total 15,343,302): guard_type_failure: 6,886,972 (44.9%) guard_shape_failure: 6,854,835 (44.7%) block_param_proxy_not_iseq_or_ifunc: 1,008,346 ( 6.6%) unhandled_hir_insn: 236,976 ( 1.5%) compile_error: 191,769 ( 1.2%) 
fixnum_mult_overflow: 50,739 ( 0.3%) block_param_proxy_modified: 28,119 ( 0.2%) patchpoint_stable_constant_names: 19,858 ( 0.1%) unhandled_newarray_send_pack: 14,481 ( 0.1%) unhandled_block_arg: 13,787 ( 0.1%) fixnum_lshift_overflow: 10,085 ( 0.1%) patchpoint_no_ep_escape: 7,815 ( 0.1%) unhandled_yarv_insn: 7,611 ( 0.0%) expandarray_failure: 4,533 ( 0.0%) guard_super_method_entry: 4,475 ( 0.0%) patchpoint_method_redefined: 1,212 ( 0.0%) patchpoint_no_singleton_class: 1,130 ( 0.0%) obj_to_string_fallback: 275 ( 0.0%) guard_less_failure: 163 ( 0.0%) interrupt: 102 ( 0.0%) send_count: 152,019,764 dynamic_send_count: 51,873,375 (34.1%) optimized_send_count: 100,146,389 (65.9%) dynamic_setivar_count: 2,361,107 ( 1.6%) dynamic_getivar_count: 6,059,319 ( 4.0%) dynamic_definedivar_count: 405,302 ( 0.3%) iseq_optimized_send_count: 40,149,182 (26.4%) inline_cfunc_optimized_send_count: 40,168,875 (26.4%) inline_iseq_optimized_send_count: 3,408,619 ( 2.2%) non_variadic_cfunc_optimized_send_count: 8,896,927 ( 5.9%) variadic_cfunc_optimized_send_count: 7,522,786 ( 4.9%) compiled_iseq_count: 5,554 failed_iseq_count: 0 compile_time: 1,784ms profile_time: 13ms gc_time: 19ms invalidation_time: 261ms vm_write_pc_count: 133,027,580 vm_write_sp_count: 133,027,580 vm_write_locals_count: 129,024,228 vm_write_stack_count: 129,024,228 vm_write_to_parent_iseq_local_count: 693,264 vm_read_from_parent_iseq_local_count: 14,727,716 guard_type_count: 157,500,381 guard_type_exit_ratio: 4.4% guard_shape_count: 64,160,894 guard_shape_exit_ratio: 10.7% code_region_bytes: 29,196,288 zjit_alloc_bytes: 44,686,498 total_mem_bytes: 73,882,786 side_exit_count: 15,343,302 total_insn_count: 934,219,385 vm_insn_count: 167,485,651 zjit_insn_count: 766,733,734 ratio_in_zjit: 82.1% ``` </details> ### rails-bench <details> <summary>before patch</summary> ``` Average of last 10, non-warmup iters: 1146ms ***ZJIT: Printing ZJIT statistics on exit*** Top-20 not inlined C methods (52.4% of total 38,306,776): 
Hash#key?: 3,141,619 ( 8.2%) Regexp#match?: 2,420,225 ( 6.3%) Hash#fetch: 2,245,557 ( 5.9%) Integer#===: 1,098,163 ( 2.9%) Hash#delete: 1,014,375 ( 2.6%) Array#any?: 1,007,766 ( 2.6%) String.new: 1,004,713 ( 2.6%) String#b: 797,913 ( 2.1%) String#to_sym: 680,943 ( 1.8%) Array#all?: 650,132 ( 1.7%) Fiber.current: 649,003 ( 1.7%) Array#join: 641,038 ( 1.7%) Array#include?: 613,837 ( 1.6%) Kernel#Array: 610,311 ( 1.6%) String#<<: 606,240 ( 1.6%) Symbol#end_with?: 598,807 ( 1.6%) String#force_encoding: 593,535 ( 1.5%) Kernel#dup: 580,051 ( 1.5%) Array#[]: 562,360 ( 1.5%) Kernel#respond_to?: 550,441 ( 1.4%) Top-20 calls to C functions from JIT code (75.5% of total 262,197,810): rb_vm_opt_send_without_block: 54,534,682 (20.8%) rb_hash_aref: 22,920,285 ( 8.7%) rb_vm_env_write: 19,385,633 ( 7.4%) rb_vm_send: 17,070,477 ( 6.5%) rb_zjit_writebarrier_check_immediate: 13,780,973 ( 5.3%) rb_vm_getinstancevariable: 12,379,513 ( 4.7%) rb_ivar_get_at_no_ractor_check: 12,156,906 ( 4.6%) rb_vm_invokesuper: 8,086,665 ( 3.1%) rb_hash_aset: 5,043,536 ( 1.9%) rb_obj_is_kind_of: 4,431,123 ( 1.7%) rb_vm_invokeblock: 4,036,483 ( 1.5%) Hash#key?: 3,141,619 ( 1.2%) rb_vm_opt_getconstant_path: 3,053,319 ( 1.2%) rb_class_allocate_instance: 2,878,526 ( 1.1%) rb_hash_new_with_size: 2,823,745 ( 1.1%) rb_ec_ary_new_from_values: 2,585,553 ( 1.0%) rb_str_concat_literals: 2,450,764 ( 0.9%) Regexp#match?: 2,420,225 ( 0.9%) rb_obj_alloc: 2,419,171 ( 0.9%) rb_vm_setinstancevariable: 2,357,067 ( 0.9%) Top-2 not optimized method types for send (100.0% of total 8,550,760): iseq: 8,518,289 (99.6%) optimized: 32,471 ( 0.4%) Top-2 not optimized method types for send_without_block (100.0% of total 789,641): optimized_send: 606,885 (76.9%) null: 182,756 (23.1%) Top-2 not optimized method types for super (100.0% of total 6,689,859): cfunc: 6,640,180 (99.3%) attrset: 49,679 ( 0.7%) Top-3 instructions with uncategorized fallback reason (100.0% of total 5,962,039): invokeblock: 4,036,483 (67.7%) sendforward: 
1,871,601 (31.4%) opt_send_without_block: 53,955 ( 0.9%) Top-20 send fallback reasons (100.0% of total 85,599,908): send_without_block_polymorphic: 31,804,276 (37.2%) send_without_block_no_profiles: 13,349,825 (15.6%) send_not_optimized_method_type: 8,550,760 (10.0%) super_not_optimized_method_type: 6,689,859 ( 7.8%) uncategorized: 5,962,039 ( 7.0%) send_no_profiles: 5,200,278 ( 6.1%) one_or_more_complex_arg_pass: 4,198,502 ( 4.9%) send_polymorphic: 3,318,658 ( 3.9%) send_without_block_not_optimized_need_permission: 1,274,177 ( 1.5%) too_many_args_for_lir: 1,139,487 ( 1.3%) singleton_class_seen: 1,101,973 ( 1.3%) super_complex_args_pass: 829,842 ( 1.0%) send_without_block_not_optimized_method_type_optimized: 606,885 ( 0.7%) send_without_block_megamorphic: 565,874 ( 0.7%) super_target_complex_args_pass: 414,600 ( 0.5%) send_without_block_not_optimized_method_type: 182,756 ( 0.2%) obj_to_string_not_string: 158,141 ( 0.2%) super_call_with_block: 100,004 ( 0.1%) send_without_block_direct_keyword_mismatch: 99,588 ( 0.1%) super_polymorphic: 52,360 ( 0.1%) Top-2 setivar fallback reasons (100.0% of total 2,357,067): not_monomorphic: 2,255,283 (95.7%) not_t_object: 101,784 ( 4.3%) Top-1 getivar fallback reasons (100.0% of total 12,379,538): not_monomorphic: 12,379,538 (100.0%) Top-2 definedivar fallback reasons (100.0% of total 350,548): not_monomorphic: 350,461 (100.0%) not_t_object: 87 ( 0.0%) Top-6 invokeblock handler (100.0% of total 4,036,483): monomorphic_iseq: 2,189,057 (54.2%) polymorphic: 1,207,002 (29.9%) monomorphic_other: 334,248 ( 8.3%) monomorphic_ifunc: 221,225 ( 5.5%) megamorphic: 84,439 ( 2.1%) no_profiles: 512 ( 0.0%) Top-8 popular complex argument-parameter features not optimized (100.0% of total 5,212,154): param_forwardable: 1,824,953 (35.0%) param_block: 1,792,214 (34.4%) param_rest: 861,894 (16.5%) caller_splat: 283,669 ( 5.4%) caller_kw_splat: 248,291 ( 4.8%) param_kwrest: 200,208 ( 3.8%) caller_blockarg: 752 ( 0.0%) caller_kwarg: 173 ( 0.0%) Top-1 
compile error reasons (100.0% of total 391,562): exception_handler: 391,562 (100.0%) Top-6 unhandled YARV insns (100.0% of total 1,000,531): invokesuperforward: 498,993 (49.9%) getconstant: 400,945 (40.1%) expandarray: 49,985 ( 5.0%) setblockparam: 49,972 ( 5.0%) checkmatch: 480 ( 0.0%) once: 156 ( 0.0%) Top-2 unhandled HIR insns (100.0% of total 268,151): throw: 232,560 (86.7%) invokebuiltin: 35,591 (13.3%) Top-19 side exit reasons (100.0% of total 8,709,784): guard_shape_failure: 2,497,335 (28.7%) block_param_proxy_not_iseq_or_ifunc: 1,988,408 (22.8%) guard_type_failure: 1,722,007 (19.8%) unhandled_yarv_insn: 1,000,531 (11.5%) compile_error: 391,562 ( 4.5%) unhandled_newarray_send_pack: 298,017 ( 3.4%) unhandled_hir_insn: 268,151 ( 3.1%) patchpoint_method_redefined: 200,632 ( 2.3%) unhandled_block_arg: 151,295 ( 1.7%) block_param_proxy_modified: 124,245 ( 1.4%) guard_less_failure: 50,126 ( 0.6%) fixnum_lshift_overflow: 9,985 ( 0.1%) patchpoint_stable_constant_names: 6,350 ( 0.1%) fixnum_mult_overflow: 570 ( 0.0%) obj_to_string_fallback: 405 ( 0.0%) patchpoint_no_ep_escape: 109 ( 0.0%) interrupt: 42 ( 0.0%) guard_super_method_entry: 8 ( 0.0%) guard_greater_eq_failure: 6 ( 0.0%) send_count: 329,199,237 dynamic_send_count: 85,599,908 (26.0%) optimized_send_count: 243,599,329 (74.0%) dynamic_setivar_count: 2,357,067 ( 0.7%) dynamic_getivar_count: 12,379,538 ( 3.8%) dynamic_definedivar_count: 350,548 ( 0.1%) iseq_optimized_send_count: 93,946,576 (28.5%) inline_cfunc_optimized_send_count: 97,478,983 (29.6%) inline_iseq_optimized_send_count: 9,138,886 ( 2.8%) non_variadic_cfunc_optimized_send_count: 25,367,116 ( 7.7%) variadic_cfunc_optimized_send_count: 17,667,768 ( 5.4%) compiled_iseq_count: 2,888 failed_iseq_count: 0 compile_time: 876ms profile_time: 28ms gc_time: 6ms invalidation_time: 8ms vm_write_pc_count: 287,051,837 vm_write_sp_count: 287,051,837 vm_write_locals_count: 273,948,883 vm_write_stack_count: 273,948,883 vm_write_to_parent_iseq_local_count: 1,079,877 
vm_read_from_parent_iseq_local_count: 30,814,984 guard_type_count: 310,888,965 guard_type_exit_ratio: 0.6% guard_shape_count: 108,669,058 guard_shape_exit_ratio: 2.3% code_region_bytes: 14,352,384 zjit_alloc_bytes: 18,992,674 total_mem_bytes: 33,345,058 side_exit_count: 8,709,784 total_insn_count: 1,705,856,454 vm_insn_count: 122,246,885 zjit_insn_count: 1,583,609,569 ratio_in_zjit: 92.8% ``` </details> <details> <summary>after patch</summary> ``` Average of last 10, non-warmup iters: 1072ms ***ZJIT: Printing ZJIT statistics on exit*** Top-20 not inlined C methods (52.5% of total 38,239,504): Hash#key?: 3,141,619 ( 8.2%) Regexp#match?: 2,420,215 ( 6.3%) Hash#fetch: 2,245,557 ( 5.9%) Integer#===: 1,097,515 ( 2.9%) Hash#delete: 1,014,375 ( 2.7%) Array#any?: 1,007,756 ( 2.6%) String.new: 1,004,713 ( 2.6%) String#b: 797,913 ( 2.1%) String#to_sym: 680,943 ( 1.8%) Array#all?: 650,132 ( 1.7%) Fiber.current: 649,003 ( 1.7%) Array#join: 641,038 ( 1.7%) Array#include?: 613,837 ( 1.6%) Kernel#Array: 610,311 ( 1.6%) String#<<: 606,240 ( 1.6%) Symbol#end_with?: 598,807 ( 1.6%) String#force_encoding: 593,535 ( 1.6%) Kernel#dup: 580,051 ( 1.5%) Array#[]: 562,360 ( 1.5%) Kernel#respond_to?: 550,441 ( 1.4%) Top-20 calls to C functions from JIT code (75.4% of total 262,218,592): rb_vm_opt_send_without_block: 54,249,429 (20.7%) rb_hash_aref: 22,920,271 ( 8.7%) rb_vm_env_write: 19,385,609 ( 7.4%) rb_vm_send: 17,070,463 ( 6.5%) rb_zjit_writebarrier_check_immediate: 13,780,893 ( 5.3%) rb_vm_getinstancevariable: 12,322,924 ( 4.7%) rb_ivar_get_at_no_ractor_check: 12,156,898 ( 4.6%) rb_vm_invokesuper: 8,086,659 ( 3.1%) rb_hash_aset: 5,043,532 ( 1.9%) rb_obj_is_kind_of: 4,474,826 ( 1.7%) rb_vm_invokeblock: 4,036,471 ( 1.5%) Hash#key?: 3,141,619 ( 1.2%) rb_vm_opt_getconstant_path: 3,053,286 ( 1.2%) rb_class_allocate_instance: 2,878,505 ( 1.1%) rb_hash_new_with_size: 2,823,748 ( 1.1%) rb_ec_ary_new_from_values: 2,585,561 ( 1.0%) rb_str_concat_literals: 2,450,756 ( 0.9%) Regexp#match?: 
2,420,215 ( 0.9%) rb_obj_alloc: 2,419,146 ( 0.9%) rb_vm_setinstancevariable: 2,357,065 ( 0.9%) Top-2 not optimized method types for send (100.0% of total 8,550,755): iseq: 8,518,284 (99.6%) optimized: 32,471 ( 0.4%) Top-2 not optimized method types for send_without_block (100.0% of total 789,641): optimized_send: 606,885 (76.9%) null: 182,756 (23.1%) Top-2 not optimized method types for super (100.0% of total 6,689,853): cfunc: 6,640,178 (99.3%) attrset: 49,675 ( 0.7%) Top-4 instructions with uncategorized fallback reason (100.0% of total 6,461,020): invokeblock: 4,036,471 (62.5%) sendforward: 1,871,601 (29.0%) invokesuperforward: 498,993 ( 7.7%) opt_send_without_block: 53,955 ( 0.8%) Top-20 send fallback reasons (100.0% of total 85,813,616): send_without_block_polymorphic: 31,519,543 (36.7%) send_without_block_no_profiles: 13,349,751 (15.6%) send_not_optimized_method_type: 8,550,755 (10.0%) super_not_optimized_method_type: 6,689,853 ( 7.8%) uncategorized: 6,461,020 ( 7.5%) send_no_profiles: 5,200,273 ( 6.1%) one_or_more_complex_arg_pass: 4,198,498 ( 4.9%) send_polymorphic: 3,318,658 ( 3.9%) send_without_block_not_optimized_need_permission: 1,273,739 ( 1.5%) too_many_args_for_lir: 1,139,487 ( 1.3%) singleton_class_seen: 1,101,973 ( 1.3%) super_complex_args_pass: 829,842 ( 1.0%) send_without_block_not_optimized_method_type_optimized: 606,885 ( 0.7%) send_without_block_megamorphic: 565,874 ( 0.7%) super_target_complex_args_pass: 414,600 ( 0.5%) send_without_block_not_optimized_method_type: 182,756 ( 0.2%) obj_to_string_not_string: 158,133 ( 0.2%) super_call_with_block: 100,004 ( 0.1%) send_without_block_direct_keyword_mismatch: 99,588 ( 0.1%) super_polymorphic: 52,360 ( 0.1%) Top-2 setivar fallback reasons (100.0% of total 2,357,065): not_monomorphic: 2,255,281 (95.7%) not_t_object: 101,784 ( 4.3%) Top-1 getivar fallback reasons (100.0% of total 12,322,949): not_monomorphic: 12,322,949 (100.0%) Top-2 definedivar fallback reasons (100.0% of total 350,548): 
not_monomorphic: 350,461 (100.0%) not_t_object: 87 ( 0.0%) Top-6 invokeblock handler (100.0% of total 4,036,471): monomorphic_iseq: 2,189,045 (54.2%) polymorphic: 1,207,002 (29.9%) monomorphic_other: 334,248 ( 8.3%) monomorphic_ifunc: 221,225 ( 5.5%) megamorphic: 84,439 ( 2.1%) no_profiles: 512 ( 0.0%) Top-8 popular complex argument-parameter features not optimized (100.0% of total 5,212,150): param_forwardable: 1,824,953 (35.0%) param_block: 1,792,214 (34.4%) param_rest: 861,894 (16.5%) caller_splat: 283,669 ( 5.4%) caller_kw_splat: 248,291 ( 4.8%) param_kwrest: 200,208 ( 3.8%) caller_blockarg: 748 ( 0.0%) caller_kwarg: 173 ( 0.0%) Top-1 compile error reasons (100.0% of total 391,562): exception_handler: 391,562 (100.0%) Top-5 unhandled YARV insns (100.0% of total 501,538): getconstant: 400,945 (79.9%) expandarray: 49,985 (10.0%) setblockparam: 49,972 (10.0%) checkmatch: 480 ( 0.1%) once: 156 ( 0.0%) Top-2 unhandled HIR insns (100.0% of total 268,152): throw: 232,560 (86.7%) invokebuiltin: 35,592 (13.3%) Top-19 side exit reasons (100.0% of total 8,210,699): guard_shape_failure: 2,497,552 (30.4%) block_param_proxy_not_iseq_or_ifunc: 1,988,408 (24.2%) guard_type_failure: 1,721,809 (21.0%) unhandled_yarv_insn: 501,538 ( 6.1%) compile_error: 391,562 ( 4.8%) unhandled_newarray_send_pack: 298,017 ( 3.6%) unhandled_hir_insn: 268,152 ( 3.3%) patchpoint_method_redefined: 200,632 ( 2.4%) unhandled_block_arg: 151,295 ( 1.8%) block_param_proxy_modified: 124,245 ( 1.5%) guard_less_failure: 50,033 ( 0.6%) fixnum_lshift_overflow: 9,985 ( 0.1%) patchpoint_stable_constant_names: 6,342 ( 0.1%) fixnum_mult_overflow: 570 ( 0.0%) obj_to_string_fallback: 405 ( 0.0%) patchpoint_no_ep_escape: 109 ( 0.0%) interrupt: 31 ( 0.0%) guard_super_method_entry: 8 ( 0.0%) guard_greater_eq_failure: 6 ( 0.0%) send_count: 328,805,013 dynamic_send_count: 85,813,616 (26.1%) optimized_send_count: 242,991,397 (73.9%) dynamic_setivar_count: 2,357,065 ( 0.7%) dynamic_getivar_count: 12,322,949 ( 3.7%) 
dynamic_definedivar_count: 350,548 ( 0.1%) iseq_optimized_send_count: 93,990,621 (28.6%) inline_cfunc_optimized_send_count: 96,851,696 (29.5%) inline_iseq_optimized_send_count: 9,181,467 ( 2.8%) non_variadic_cfunc_optimized_send_count: 25,304,458 ( 7.7%) variadic_cfunc_optimized_send_count: 17,663,155 ( 5.4%) compiled_iseq_count: 2,886 failed_iseq_count: 0 compile_time: 875ms profile_time: 27ms gc_time: 66ms invalidation_time: 9ms vm_write_pc_count: 287,186,308 vm_write_sp_count: 287,186,308 vm_write_locals_count: 274,139,228 vm_write_stack_count: 274,139,228 vm_write_to_parent_iseq_local_count: 1,079,877 vm_read_from_parent_iseq_local_count: 30,810,378 guard_type_count: 310,644,961 guard_type_exit_ratio: 0.6% guard_shape_count: 109,072,242 guard_shape_exit_ratio: 2.3% code_region_bytes: 14,352,384 zjit_alloc_bytes: 19,186,174 total_mem_bytes: 33,538,558 side_exit_count: 8,210,699 total_insn_count: 1,705,193,555 vm_insn_count: 123,691,343 zjit_insn_count: 1,581,502,212 ratio_in_zjit: 92.7% ``` </details>
Also, include the column in here. Hopefully we can do some additional optimizations later. ruby/prism@7759acdd26
* Enable double-quoted options with an `=` sign.
* Replace `$` with `$$` in the batch file without CPP.
* Support for `--with-destdir`.
* Allow Makefile macro definition.

(Close rubyGH-15935)
Also handle paths that contain spaces when splitting the `--with-opt-dir` argument.
Previously, Visual C++ had only one toolchain for the x86 family, and the only option was to select the target processor level. In recent versions, there are multiple toolchains with the same command name for each host/target platform combination, so it is no longer possible to select the target with a command-line option. Also, configure.bat assumes that the toolchain has been configured before it is executed, so selecting it from this batch file is meaningless. Therefore, the only possible check is whether the specified target and compiler match.
Avoids an issue where `%undefined:A=B%` expands to a literal `A=B` because the parser fails to find the variable before the colon, then parses the following percent as the next variable expansion. Added a definition check to ensure safe expansion.
Resolves Shopify#772

Adds profiling for the `getblockparamproxy` YARV instruction and handles the `nil` block case by pushing `nil` instead of the block proxy object. This improves `ratio_in_zjit` slightly, from 82.1% to 82.2% on Lobsters.

Profiling data for `getblockparamproxy` on Lobsters:

```
Top-6 getblockparamproxy handler (100.0% of total 3,353,291):
polymorphic: 2,337,372 (69.7%)
nil: 552,629 (16.5%)
iseq: 259,636 ( 7.7%)
no_profiles: 156,734 ( 4.7%)
proc: 40,223 ( 1.2%)
megamorphic: 6,697 ( 0.2%)
```

Lobsters benchmark stats:

<details>
<summary>Stats before (master):</summary>
<p>

```
❯ ./run_benchmarks.rb --chruby 'ruby-zjit --zjit-stats' lobsters
***ZJIT: Printing ZJIT statistics on exit***
...
Top-20 side exit reasons (100.0% of total 15,338,024):
guard_type_failure: 6,889,050 (44.9%)
guard_shape_failure: 6,848,898 (44.7%)
block_param_proxy_not_iseq_or_ifunc: 1,008,525 ( 6.6%)
unhandled_hir_insn: 236,977 ( 1.5%)
compile_error: 191,763 ( 1.3%)
fixnum_mult_overflow: 50,739 ( 0.3%)
block_param_proxy_modified: 28,119 ( 0.2%)
patchpoint_stable_constant_names: 18,229 ( 0.1%)
unhandled_newarray_send_pack: 14,481 ( 0.1%)
unhandled_block_arg: 13,782 ( 0.1%)
fixnum_lshift_overflow: 10,085 ( 0.1%)
patchpoint_no_ep_escape: 7,815 ( 0.1%)
unhandled_yarv_insn: 7,540 ( 0.0%)
expandarray_failure: 4,533 ( 0.0%)
guard_super_method_entry: 4,475 ( 0.0%)
patchpoint_method_redefined: 1,207 ( 0.0%)
patchpoint_no_singleton_class: 1,130 ( 0.0%)
obj_to_string_fallback: 412 ( 0.0%)
guard_less_failure: 163 ( 0.0%)
interrupt: 82 ( 0.0%)
...
ratio_in_zjit: 82.1%
```

</p>
</details>

<details>
<summary>Stats after:</summary>
<p>

```
❯ ./run_benchmarks.rb --chruby 'ruby-zjit --zjit-stats' lobsters
***ZJIT: Printing ZJIT statistics on exit***
...
Top-20 side exit reasons (100.0% of total 15,061,422):
guard_type_failure: 6,892,934 (45.8%)
guard_shape_failure: 6,850,512 (45.5%)
block_param_proxy_not_iseq_or_ifunc: 549,823 ( 3.7%)
unhandled_hir_insn: 236,979 ( 1.6%)
compile_error: 191,782 ( 1.3%)
unhandled_yarv_insn: 128,695 ( 0.9%)
block_param_proxy_not_nil: 68,623 ( 0.5%)
fixnum_mult_overflow: 50,739 ( 0.3%)
patchpoint_stable_constant_names: 18,568 ( 0.1%)
unhandled_newarray_send_pack: 14,481 ( 0.1%)
block_param_proxy_modified: 13,819 ( 0.1%)
unhandled_block_arg: 13,798 ( 0.1%)
fixnum_lshift_overflow: 10,085 ( 0.1%)
patchpoint_no_ep_escape: 7,815 ( 0.1%)
expandarray_failure: 4,533 ( 0.0%)
guard_super_method_entry: 4,475 ( 0.0%)
patchpoint_method_redefined: 1,207 ( 0.0%)
obj_to_string_fallback: 1,140 ( 0.0%)
patchpoint_no_singleton_class: 1,130 ( 0.0%)
guard_less_failure: 163 ( 0.0%)
...
ratio_in_zjit: 82.2%
```

</p>
</details>
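For readers unfamiliar with the instruction, here is a minimal Ruby sketch of the kind of call site this commit targets. It assumes that forwarding a `&block` parameter is a typical place where the block is read through the proxy; `each_doubled` is a hypothetical helper, not code from this PR. The second call shows the `nil` block case, where the method is called without a block.

```ruby
# Hypothetical helper for illustration; not part of the PR.
def each_doubled(items, &block)
  # `&block` is forwarded without being inspected. When the caller passed
  # no block, `block` is nil and `each(&nil)` behaves like a blockless call.
  items.map { |x| x * 2 }.each(&block)
end

collected = []
each_doubled([1, 2, 3]) { |x| collected << x } # block given
enum = each_doubled([1, 2, 3])                 # nil block: returns an Enumerator

p collected
p enum.to_a
```

Both call shapes go through the same method body, which is why the profile above distinguishes `nil` from `iseq` and `proc` handlers.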
Break out the different cases into different blocks in the bytecode to HIR parser. Use a `RefineType` to plumb the case's type through so the type specialization can see it. Then join the logic back to the rest of the current block after each case's send.
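The shape of this transformation can be sketched at the Ruby level. This is only a conceptual analogy, assuming a call site that has seen `String` and `Array` receivers; the real work happens on HIR blocks, and `size_split`, `STRING_SIZE`, and `ARRAY_SIZE` are hypothetical names. Each branch plays the role of one case block with a refined receiver type, and the final `result` plays the role of the join.

```ruby
# Direct method handles stand in for a type-specialized send.
STRING_SIZE = String.instance_method(:size)
ARRAY_SIZE  = Array.instance_method(:size)

# Original form: one polymorphic call site.
def size_generic(obj)
  obj.size
end

# Split form: one branch per observed class, then a join.
def size_split(obj)
  result =
    if obj.class.equal?(String)
      STRING_SIZE.bind_call(obj) # receiver known to be a String here
    elsif obj.class.equal?(Array)
      ARRAY_SIZE.bind_call(obj)  # receiver known to be an Array here
    else
      obj.size                   # fallback: generic dynamic send
    end
  result # all cases rejoin the rest of the method
end

p size_split("abc")
p size_split([1, 2])
p size_split({ a: 1 })
```

The split form returns the same values as `size_generic`; only the dispatch inside each branch is more specific.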
lobsters before
<details>
```
***ZJIT: Printing ZJIT statistics on exit***
Top-20 not inlined C methods (58.7% of total 4,476,259):
Hash#fetch: 849,219 (19.0%)
String#start_with?: 328,017 ( 7.3%)
Regexp#match?: 148,149 ( 3.3%)
Hash#key?: 135,034 ( 3.0%)
Kernel#is_a?: 110,030 ( 2.5%)
Set#include?: 97,934 ( 2.2%)
Integer#===: 96,952 ( 2.2%)
Process.clock_gettime: 92,795 ( 2.1%)
String#sub!: 84,940 ( 1.9%)
String.new: 80,730 ( 1.8%)
SQLite3::Statement#done?: 73,532 ( 1.6%)
SQLite3::Statement#step: 73,532 ( 1.6%)
Time#plus_without_duration: 66,724 ( 1.5%)
String#<<: 63,954 ( 1.4%)
Time#to_i: 60,817 ( 1.4%)
Hash#delete: 60,664 ( 1.4%)
Time#subsec: 60,363 ( 1.3%)
String#hash: 51,261 ( 1.1%)
IO#read: 47,753 ( 1.1%)
String#to_sym: 43,915 ( 1.0%)
Top-20 calls to C functions from JIT code (83.7% of total 35,570,418):
rb_vm_opt_send_without_block: 10,516,746 (29.6%)
rb_vm_env_write: 2,382,117 ( 6.7%)
rb_zjit_writebarrier_check_immediate: 2,241,285 ( 6.3%)
rb_hash_aref: 2,189,588 ( 6.2%)
rb_vm_getinstancevariable: 1,762,596 ( 5.0%)
rb_ivar_get_at_no_ractor_check: 1,702,246 ( 4.8%)
rb_vm_send: 1,460,754 ( 4.1%)
rb_hash_aset: 1,151,302 ( 3.2%)
rb_vm_setinstancevariable: 1,029,286 ( 2.9%)
rb_obj_is_kind_of: 1,000,979 ( 2.8%)
rb_vm_opt_getconstant_path: 623,490 ( 1.8%)
rb_vm_invokesuper: 595,831 ( 1.7%)
Hash#fetch: 562,212 ( 1.6%)
rb_vm_invokeblock: 545,744 ( 1.5%)
rb_class_allocate_instance: 422,454 ( 1.2%)
rb_ec_ary_new_from_values: 388,035 ( 1.1%)
String#start_with?: 328,017 ( 0.9%)
rb_hash_new_with_size: 289,130 ( 0.8%)
fetch: 287,007 ( 0.8%)
rb_vm_sendforward: 284,183 ( 0.8%)
Top-1 not optimized method types for send (100.0% of total 428):
null: 428 (100.0%)
Top-3 not optimized method types for send_without_block (100.0% of total 102,413):
optimized_send: 92,837 (90.6%)
null: 8,595 ( 8.4%)
optimized_block_call: 981 ( 1.0%)
Top-3 not optimized method types for super (100.0% of total 517,931):
cfunc: 489,746 (94.6%)
alias: 26,398 ( 5.1%)
attrset: 1,787 ( 0.3%)
Top-4 instructions with uncategorized fallback reason (100.0% of total 868,223):
invokeblock: 545,744 (62.9%)
sendforward: 284,183 (32.7%)
invokesuperforward: 29,713 ( 3.4%)
opt_send_without_block: 8,583 ( 1.0%)
Top-20 send fallback reasons (100.0% of total 13,432,971):
send_without_block_polymorphic: 4,825,641 (35.9%)
singleton_class_seen: 3,257,447 (24.2%)
send_without_block_no_profiles: 1,906,060 (14.2%)
uncategorized: 868,223 ( 6.5%)
send_no_profiles: 806,168 ( 6.0%)
one_or_more_complex_arg_pass: 537,965 ( 4.0%)
super_not_optimized_method_type: 517,931 ( 3.9%)
send_without_block_megamorphic: 158,893 ( 1.2%)
too_many_args_for_lir: 127,160 ( 0.9%)
send_polymorphic: 112,628 ( 0.8%)
send_without_block_not_optimized_need_permission: 100,041 ( 0.7%)
send_without_block_not_optimized_method_type_optimized: 93,818 ( 0.7%)
super_complex_args_pass: 34,022 ( 0.3%)
super_target_complex_args_pass: 25,536 ( 0.2%)
super_polymorphic: 16,853 ( 0.1%)
obj_to_string_not_string: 13,794 ( 0.1%)
argc_param_mismatch: 9,927 ( 0.1%)
send_without_block_not_optimized_method_type: 8,595 ( 0.1%)
send_without_block_direct_keyword_mismatch: 5,568 ( 0.0%)
send_megamorphic: 4,525 ( 0.0%)
Top-4 setivar fallback reasons (100.0% of total 1,029,286):
not_monomorphic: 992,723 (96.4%)
not_t_object: 21,354 ( 2.1%)
too_complex: 15,188 ( 1.5%)
new_shape_needs_extension: 21 ( 0.0%)
Top-2 getivar fallback reasons (100.0% of total 1,790,794):
not_monomorphic: 1,750,108 (97.7%)
too_complex: 40,686 ( 2.3%)
Top-3 definedivar fallback reasons (100.0% of total 81,713):
not_monomorphic: 80,197 (98.1%)
too_complex: 796 ( 1.0%)
not_t_object: 720 ( 0.9%)
Top-6 invokeblock handler (100.0% of total 545,744):
monomorphic_iseq: 249,809 (45.8%)
polymorphic: 217,915 (39.9%)
monomorphic_ifunc: 46,244 ( 8.5%)
monomorphic_other: 27,938 ( 5.1%)
megamorphic: 2,943 ( 0.5%)
no_profiles: 895 ( 0.2%)
Top-8 popular complex argument-parameter features not optimized (100.0% of total 652,565):
param_forwardable: 246,421 (37.8%)
param_block: 198,808 (30.5%)
param_rest: 101,529 (15.6%)
param_kwrest: 44,809 ( 6.9%)
caller_blockarg: 24,596 ( 3.8%)
caller_splat: 15,969 ( 2.4%)
caller_kw_splat: 14,227 ( 2.2%)
caller_kwarg: 6,206 ( 1.0%)
Top-1 compile error reasons (100.0% of total 38,981):
exception_handler: 38,981 (100.0%)
Top-5 unhandled YARV insns (100.0% of total 4,154):
getconstant: 2,566 (61.8%)
checkmatch: 929 (22.4%)
setblockparam: 443 (10.7%)
once: 171 ( 4.1%)
expandarray: 45 ( 1.1%)
Top-3 unhandled HIR insns (100.0% of total 75,904):
throw: 39,721 (52.3%)
invokebuiltin: 35,772 (47.1%)
array_max: 411 ( 0.5%)
Top-20 side exit reasons (100.0% of total 3,770,125):
guard_shape_failure: 1,927,218 (51.1%)
guard_type_failure: 1,395,315 (37.0%)
block_param_proxy_not_iseq_or_ifunc: 257,894 ( 6.8%)
unhandled_hir_insn: 75,904 ( 2.0%)
compile_error: 38,981 ( 1.0%)
patchpoint_stable_constant_names: 25,375 ( 0.7%)
block_param_proxy_modified: 13,713 ( 0.4%)
fixnum_lshift_overflow: 10,085 ( 0.3%)
fixnum_mult_overflow: 8,550 ( 0.2%)
unhandled_yarv_insn: 4,154 ( 0.1%)
unhandled_block_arg: 2,548 ( 0.1%)
unhandled_newarray_send_pack: 2,322 ( 0.1%)
patchpoint_no_singleton_class: 2,008 ( 0.1%)
patchpoint_no_ep_escape: 1,683 ( 0.0%)
obj_to_string_fallback: 1,358 ( 0.0%)
patchpoint_method_redefined: 1,212 ( 0.0%)
expandarray_failure: 837 ( 0.0%)
guard_super_method_entry: 737 ( 0.0%)
guard_less_failure: 163 ( 0.0%)
interrupt: 49 ( 0.0%)
send_count: 46,003,239
dynamic_send_count: 13,432,971 (29.2%)
optimized_send_count: 32,570,268 (70.8%)
dynamic_setivar_count: 1,029,286 ( 2.2%)
dynamic_getivar_count: 1,790,794 ( 3.9%)
dynamic_definedivar_count: 81,713 ( 0.2%)
iseq_optimized_send_count: 15,117,301 (32.9%)
inline_cfunc_optimized_send_count: 11,837,918 (25.7%)
inline_iseq_optimized_send_count: 884,606 ( 1.9%)
non_variadic_cfunc_optimized_send_count: 2,597,998 ( 5.6%)
variadic_cfunc_optimized_send_count: 2,132,445 ( 4.6%)
compiled_iseq_count: 5,259
failed_iseq_count: 0
compile_time: 1,409ms
profile_time: 10ms
gc_time: 11ms
invalidation_time: 77ms
vm_write_pc_count: 40,924,587
vm_write_sp_count: 40,924,587
vm_write_locals_count: 39,740,467
vm_write_stack_count: 39,740,467
vm_write_to_parent_iseq_local_count: 306,481
vm_read_from_parent_iseq_local_count: 4,841,855
guard_type_count: 48,810,089
guard_type_exit_ratio: 2.9%
guard_shape_count: 19,485,073
guard_shape_exit_ratio: 9.9%
code_region_bytes: 27,262,976
zjit_alloc_bytes: 34,517,324
total_mem_bytes: 61,780,300
side_exit_count: 3,770,125
total_insn_count: 273,152,243
vm_insn_count: 43,926,931
zjit_insn_count: 229,225,312
ratio_in_zjit: 83.9%
```
</details>
lobsters after
<details>
```
***ZJIT: Printing ZJIT statistics on exit***
Top-20 not inlined C methods (61.7% of total 5,220,252):
Hash#fetch: 1,274,409 (24.4%)
String#start_with?: 328,017 ( 6.3%)
Regexp#match?: 147,525 ( 2.8%)
Hash#key?: 139,198 ( 2.7%)
Kernel#is_a?: 110,178 ( 2.1%)
Class#allocate: 107,143 ( 2.1%)
Hash#delete: 106,307 ( 2.0%)
Class#superclass: 98,165 ( 1.9%)
Set#include?: 97,934 ( 1.9%)
Integer#===: 95,874 ( 1.8%)
Process.clock_gettime: 92,795 ( 1.8%)
String#sub!: 80,732 ( 1.5%)
String.new: 80,730 ( 1.5%)
SQLite3::Statement#done?: 73,532 ( 1.4%)
SQLite3::Statement#step: 73,532 ( 1.4%)
Time#plus_without_duration: 66,724 ( 1.3%)
String#<<: 63,954 ( 1.2%)
Kernel#dup: 62,590 ( 1.2%)
Time#to_i: 60,814 ( 1.2%)
Time#subsec: 60,363 ( 1.2%)
Top-20 calls to C functions from JIT code (80.8% of total 33,681,248):
rb_vm_opt_send_without_block: 6,869,559 (20.4%)
rb_hash_aref: 2,487,056 ( 7.4%)
rb_vm_env_write: 2,372,693 ( 7.0%)
rb_zjit_writebarrier_check_immediate: 2,238,890 ( 6.6%)
rb_vm_getinstancevariable: 1,861,700 ( 5.5%)
rb_ivar_get_at_no_ractor_check: 1,702,246 ( 5.1%)
rb_vm_send: 1,468,202 ( 4.4%)
rb_hash_aset: 1,267,469 ( 3.8%)
rb_obj_is_kind_of: 1,126,363 ( 3.3%)
rb_vm_setinstancevariable: 1,055,131 ( 3.1%)
Hash#fetch: 987,402 ( 2.9%)
rb_vm_opt_getconstant_path: 641,779 ( 1.9%)
rb_vm_invokesuper: 603,416 ( 1.8%)
rb_vm_invokeblock: 545,743 ( 1.6%)
rb_class_allocate_instance: 415,748 ( 1.2%)
rb_ec_ary_new_from_values: 380,080 ( 1.1%)
String#start_with?: 328,017 ( 1.0%)
rb_hash_new_with_size: 289,172 ( 0.9%)
fetch: 287,007 ( 0.9%)
rb_vm_sendforward: 283,885 ( 0.8%)
Top-1 not optimized method types for send (100.0% of total 428):
null: 428 (100.0%)
Top-3 not optimized method types for send_without_block (100.0% of total 202,329):
optimized_send: 190,504 (94.2%)
null: 10,844 ( 5.4%)
optimized_block_call: 981 ( 0.5%)
Top-3 not optimized method types for super (100.0% of total 517,421):
cfunc: 489,236 (94.6%)
alias: 26,398 ( 5.1%)
attrset: 1,787 ( 0.3%)
Top-4 instructions with uncategorized fallback reason (100.0% of total 867,452):
invokeblock: 545,743 (62.9%)
sendforward: 283,885 (32.7%)
invokesuperforward: 29,713 ( 3.4%)
opt_send_without_block: 8,111 ( 0.9%)
Top-20 send fallback reasons (100.0% of total 9,800,518):
singleton_class_seen: 3,293,078 (33.6%)
send_without_block_no_profiles: 2,142,301 (21.9%)
uncategorized: 867,452 ( 8.9%)
send_no_profiles: 820,538 ( 8.4%)
send_without_block_polymorphic: 780,065 ( 8.0%)
one_or_more_complex_arg_pass: 556,514 ( 5.7%)
super_not_optimized_method_type: 517,421 ( 5.3%)
send_without_block_not_optimized_method_type_optimized: 191,485 ( 2.0%)
send_without_block_megamorphic: 161,550 ( 1.6%)
too_many_args_for_lir: 127,190 ( 1.3%)
send_polymorphic: 111,290 ( 1.1%)
send_without_block_not_optimized_need_permission: 99,526 ( 1.0%)
super_polymorphic: 45,651 ( 0.5%)
super_complex_args_pass: 33,748 ( 0.3%)
obj_to_string_not_string: 13,794 ( 0.1%)
send_without_block_not_optimized_method_type: 10,844 ( 0.1%)
argc_param_mismatch: 9,927 ( 0.1%)
send_without_block_direct_keyword_mismatch: 6,336 ( 0.1%)
super_target_complex_args_pass: 5,108 ( 0.1%)
send_megamorphic: 4,525 ( 0.0%)
Top-4 setivar fallback reasons (100.0% of total 1,123,837):
not_monomorphic: 1,087,274 (96.7%)
not_t_object: 21,354 ( 1.9%)
too_complex: 15,188 ( 1.4%)
new_shape_needs_extension: 21 ( 0.0%)
Top-2 getivar fallback reasons (100.0% of total 2,132,203):
not_monomorphic: 2,092,243 (98.1%)
too_complex: 39,960 ( 1.9%)
Top-3 definedivar fallback reasons (100.0% of total 107,264):
not_monomorphic: 105,748 (98.6%)
too_complex: 796 ( 0.7%)
not_t_object: 720 ( 0.7%)
Top-6 invokeblock handler (100.0% of total 545,743):
monomorphic_iseq: 249,809 (45.8%)
polymorphic: 217,914 (39.9%)
monomorphic_ifunc: 46,244 ( 8.5%)
monomorphic_other: 27,938 ( 5.1%)
megamorphic: 2,943 ( 0.5%)
no_profiles: 895 ( 0.2%)
Top-8 popular complex argument-parameter features not optimized (100.0% of total 651,185):
param_forwardable: 233,989 (35.9%)
param_block: 205,158 (31.5%)
param_rest: 100,319 (15.4%)
param_kwrest: 44,596 ( 6.8%)
caller_blockarg: 21,863 ( 3.4%)
caller_kw_splat: 20,970 ( 3.2%)
caller_splat: 18,106 ( 2.8%)
caller_kwarg: 6,184 ( 0.9%)
Top-1 compile error reasons (100.0% of total 38,980):
exception_handler: 38,980 (100.0%)
Top-5 unhandled YARV insns (100.0% of total 4,154):
getconstant: 2,566 (61.8%)
checkmatch: 929 (22.4%)
setblockparam: 443 (10.7%)
once: 171 ( 4.1%)
expandarray: 45 ( 1.1%)
Top-3 unhandled HIR insns (100.0% of total 75,633):
throw: 39,447 (52.2%)
invokebuiltin: 35,775 (47.3%)
array_max: 411 ( 0.5%)
Top-20 side exit reasons (100.0% of total 3,734,975):
guard_shape_failure: 1,908,302 (51.1%)
guard_type_failure: 1,391,624 (37.3%)
block_param_proxy_not_iseq_or_ifunc: 246,820 ( 6.6%)
unhandled_hir_insn: 75,633 ( 2.0%)
compile_error: 38,980 ( 1.0%)
patchpoint_stable_constant_names: 25,375 ( 0.7%)
block_param_proxy_modified: 13,713 ( 0.4%)
fixnum_lshift_overflow: 10,085 ( 0.3%)
fixnum_mult_overflow: 8,550 ( 0.2%)
unhandled_yarv_insn: 4,154 ( 0.1%)
unhandled_block_arg: 2,548 ( 0.1%)
unhandled_newarray_send_pack: 2,322 ( 0.1%)
patchpoint_no_singleton_class: 2,008 ( 0.1%)
patchpoint_no_ep_escape: 1,683 ( 0.0%)
obj_to_string_fallback: 1,358 ( 0.0%)
expandarray_failure: 837 ( 0.0%)
patchpoint_method_redefined: 710 ( 0.0%)
guard_less_failure: 163 ( 0.0%)
guard_super_method_entry: 53 ( 0.0%)
interrupt: 38 ( 0.0%)
send_count: 45,128,693
dynamic_send_count: 9,800,518 (21.7%)
optimized_send_count: 35,328,175 (78.3%)
dynamic_setivar_count: 1,123,837 ( 2.5%)
dynamic_getivar_count: 2,132,203 ( 4.7%)
dynamic_definedivar_count: 107,264 ( 0.2%)
iseq_optimized_send_count: 15,891,453 (35.2%)
inline_cfunc_optimized_send_count: 12,866,297 (28.5%)
inline_iseq_optimized_send_count: 1,102,971 ( 2.4%)
non_variadic_cfunc_optimized_send_count: 2,857,775 ( 6.3%)
variadic_cfunc_optimized_send_count: 2,609,679 ( 5.8%)
compiled_iseq_count: 5,268
failed_iseq_count: 0
compile_time: 1,558ms
profile_time: 10ms
gc_time: 13ms
invalidation_time: 84ms
vm_write_pc_count: 39,300,901
vm_write_sp_count: 39,300,901
vm_write_locals_count: 38,133,357
vm_write_stack_count: 38,133,357
vm_write_to_parent_iseq_local_count: 305,249
vm_read_from_parent_iseq_local_count: 4,818,083
guard_type_count: 48,036,224
guard_type_exit_ratio: 2.9%
guard_shape_count: 19,302,903
guard_shape_exit_ratio: 9.9%
code_region_bytes: 29,491,200
zjit_alloc_bytes: 34,932,040
total_mem_bytes: 64,423,240
side_exit_count: 3,734,975
total_insn_count: 272,964,960
vm_insn_count: 46,583,034
zjit_insn_count: 226,381,926
ratio_in_zjit: 82.9%
```
</details>
Autosplat only happens due to `yield` or `.call`, neither of which is permitted in our trivial inliner.
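As a reminder of what autosplat is, the following sketch shows the two entry points named above. A block or proc with more than one parameter that receives a single array has that array spread across its parameters, whereas a lambda does not. `yield_pair` is a hypothetical helper used only for this illustration.

```ruby
# Hypothetical helper: yields one Array to a block taking two parameters.
def yield_pair
  yield [1, 2]
end

# Autosplat via `yield`: the array is spread into a and b.
via_yield = yield_pair { |a, b| [a, b] }

# Autosplat via `.call` on a proc.
pair_proc = proc { |a, b| [a, b] }
via_call = pair_proc.call([3, 4])

# Lambdas check arity strictly, so no autosplat happens.
pair_lambda = lambda { |a, b| [a, b] }
lambda_error =
  begin
    pair_lambda.call([5, 6])
    nil
  rescue ArgumentError => e
    e.class
  end

p via_yield
p via_call
p lambda_error
```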
For now the provided size is used only for GC statistics, but in the future we may want to forward it to C23's `free_sized`, where passing an incorrect size is undefined behavior.
This PR is an extension of the work in ruby#15816. There, we optimized `super` calls where the target method was an ISeq. The code bailed on any other `super` target method type. The discussion for that PR included the ZJIT stats from running the _railsbench_ benchmark in _ruby-bench_. The stats showed the other types of `super` calls we encountered that we didn't process: ``` Top-2 not optimized method types for super (100.0% of total 2,700,015): cfunc: 2,680,044 (99.3%) attrset: 19,971 ( 0.7%) ``` This PR handles most of the cfunc cases. We still only handle simple method signatures and don't handle blocks at all, but if the target function is a cfunc where `argc != -2`, we now optimize to either `Insn::CCallWithFrame` or `Insn::CCallVariadic` as appropriate. This covers 100% of the cfunc cases we encounter in _railsbench_. <details><summary>Baseline ZJIT stats</summary> <p> ``` Top-20 not inlined C methods (51.1% of total 15,736,824): Hash#key?: 1,260,867 ( 8.0%) Regexp#match?: 970,899 ( 6.2%) Hash#fetch: 898,248 ( 5.7%) Integer#===: 439,075 ( 2.8%) Hash#delete: 405,821 ( 2.6%) Array#any?: 403,598 ( 2.6%) String.new: 401,818 ( 2.6%) String#b: 319,473 ( 2.0%) String#to_sym: 272,868 ( 1.7%) Array#all?: 260,132 ( 1.7%) Fiber.current: 259,588 ( 1.6%) Array#join: 257,125 ( 1.6%) Array#include?: 247,718 ( 1.6%) Kernel#Array: 244,574 ( 1.6%) String#<<: 242,475 ( 1.5%) Symbol#end_with?: 239,977 ( 1.5%) String#force_encoding: 239,520 ( 1.5%) Kernel#dup: 232,701 ( 1.5%) Array#[]: 225,160 ( 1.4%) Kernel#respond_to?: 220,246 ( 1.4%) Top-20 calls to C functions from JIT code (75.3% of total 106,711,108): rb_vm_opt_send_without_block: 22,031,658 (20.6%) rb_hash_aref: 9,335,540 ( 8.7%) rb_vm_env_write: 7,865,750 ( 7.4%) rb_vm_send: 6,836,936 ( 6.4%) rb_zjit_writebarrier_check_immediate: 5,623,383 ( 5.3%) rb_vm_getinstancevariable: 5,012,846 ( 4.7%) rb_ivar_get_at_no_ractor_check: 4,868,219 ( 4.6%) rb_vm_invokesuper: 3,240,208 ( 3.0%) rb_hash_aset: 2,061,526 ( 1.9%) 
rb_obj_is_kind_of: 1,812,573 ( 1.7%) rb_vm_invokeblock: 1,647,238 ( 1.5%) rb_vm_opt_getconstant_path: 1,295,958 ( 1.2%) Hash#key?: 1,260,867 ( 1.2%) rb_class_allocate_instance: 1,190,707 ( 1.1%) rb_hash_new_with_size: 1,150,766 ( 1.1%) rb_vm_setinstancevariable: 1,119,304 ( 1.0%) rb_ec_ary_new_from_values: 1,050,781 ( 1.0%) rb_obj_alloc: 993,445 ( 0.9%) rb_str_concat_literals: 984,558 ( 0.9%) Regexp#match?: 970,899 ( 0.9%) Top-2 not optimized method types for send (100.0% of total 3,423,067): iseq: 3,410,096 (99.6%) optimized: 12,971 ( 0.4%) Top-2 not optimized method types for send_without_block (100.0% of total 319,311): optimized_send: 246,250 (77.1%) null: 73,061 (22.9%) Top-2 not optimized method types for super (100.0% of total 2,680,495): cfunc: 2,660,334 (99.2%) attrset: 20,161 ( 0.8%) Top-4 instructions with uncategorized fallback reason (100.0% of total 2,617,553): invokeblock: 1,647,238 (62.9%) sendforward: 748,101 (28.6%) invokesuperforward: 199,443 ( 7.6%) opt_send_without_block: 22,771 ( 0.9%) Top-20 send fallback reasons (100.0% of total 34,703,584): send_without_block_polymorphic: 12,818,893 (36.9%) send_without_block_no_profiles: 5,442,960 (15.7%) send_not_optimized_method_type: 3,423,067 ( 9.9%) super_not_optimized_method_type: 2,680,495 ( 7.7%) uncategorized: 2,617,553 ( 7.5%) send_no_profiles: 2,083,822 ( 6.0%) one_or_more_complex_arg_pass: 1,663,149 ( 4.8%) send_polymorphic: 1,329,141 ( 3.8%) send_without_block_not_optimized_need_permission: 510,815 ( 1.5%) too_many_args_for_lir: 477,266 ( 1.4%) singleton_class_seen: 441,058 ( 1.3%) super_complex_args_pass: 331,767 ( 1.0%) send_without_block_not_optimized_method_type_optimized: 246,250 ( 0.7%) send_without_block_megamorphic: 228,672 ( 0.7%) super_target_complex_args_pass: 165,855 ( 0.5%) send_without_block_not_optimized_method_type: 73,061 ( 0.2%) obj_to_string_not_string: 67,862 ( 0.2%) super_call_with_block: 40,004 ( 0.1%) send_without_block_direct_keyword_mismatch: 39,783 ( 0.1%) 
super_polymorphic: 22,087 ( 0.1%) Top-3 setivar fallback reasons (100.0% of total 1,119,304): not_monomorphic: 1,077,792 (96.3%) not_t_object: 41,335 ( 3.7%) new_shape_needs_extension: 177 ( 0.0%) Top-1 getivar fallback reasons (100.0% of total 5,012,871): not_monomorphic: 5,012,871 (100.0%) Top-2 definedivar fallback reasons (100.0% of total 142,798): not_monomorphic: 142,711 (99.9%) not_t_object: 87 ( 0.1%) Top-6 invokeblock handler (100.0% of total 1,647,238): monomorphic_iseq: 878,253 (53.3%) polymorphic: 483,612 (29.4%) monomorphic_other: 134,943 ( 8.2%) monomorphic_ifunc: 115,175 ( 7.0%) megamorphic: 34,939 ( 2.1%) no_profiles: 316 ( 0.0%) Top-8 popular complex argument-parameter features not optimized (100.0% of total 2,068,581): param_forwardable: 729,353 (35.3%) param_block: 716,533 (34.6%) param_rest: 327,865 (15.8%) caller_splat: 114,365 ( 5.5%) caller_kw_splat: 99,266 ( 4.8%) param_kwrest: 80,149 ( 3.9%) caller_blockarg: 877 ( 0.0%) caller_kwarg: 173 ( 0.0%) Top-1 compile error reasons (100.0% of total 156,707): exception_handler: 156,707 (100.0%) Top-5 unhandled YARV insns (100.0% of total 201,517): getconstant: 160,920 (79.9%) expandarray: 19,985 ( 9.9%) setblockparam: 19,972 ( 9.9%) checkmatch: 480 ( 0.2%) once: 160 ( 0.1%) Top-2 unhandled HIR insns (100.0% of total 128,647): throw: 93,060 (72.3%) invokebuiltin: 35,587 (27.7%) Top-19 side exit reasons (100.0% of total 3,484,374): guard_shape_failure: 1,042,511 (29.9%) guard_type_failure: 812,342 (23.3%) block_param_proxy_not_iseq_or_ifunc: 795,628 (22.8%) unhandled_yarv_insn: 201,517 ( 5.8%) compile_error: 156,707 ( 4.5%) unhandled_hir_insn: 128,647 ( 3.7%) unhandled_newarray_send_pack: 119,187 ( 3.4%) patchpoint_method_redefined: 80,619 ( 2.3%) unhandled_block_arg: 60,517 ( 1.7%) block_param_proxy_modified: 49,695 ( 1.4%) guard_less_failure: 20,033 ( 0.6%) fixnum_lshift_overflow: 9,985 ( 0.3%) patchpoint_stable_constant_names: 5,752 ( 0.2%) fixnum_mult_overflow: 570 ( 0.0%) obj_to_string_fallback: 
498 ( 0.0%) patchpoint_no_ep_escape: 109 ( 0.0%) interrupt: 43 ( 0.0%) guard_super_method_entry: 8 ( 0.0%) guard_greater_eq_failure: 6 ( 0.0%) send_count: 133,679,714 dynamic_send_count: 34,703,584 (26.0%) optimized_send_count: 98,976,130 (74.0%) dynamic_setivar_count: 1,119,304 ( 0.8%) dynamic_getivar_count: 5,012,871 ( 3.7%) dynamic_definedivar_count: 142,798 ( 0.1%) iseq_optimized_send_count: 38,085,055 (28.5%) inline_cfunc_optimized_send_count: 39,628,908 (29.6%) inline_iseq_optimized_send_count: 3,624,852 ( 2.7%) non_variadic_cfunc_optimized_send_count: 10,434,756 ( 7.8%) variadic_cfunc_optimized_send_count: 7,202,559 ( 5.4%) compiled_iseq_count: 2,868 failed_iseq_count: 0 compile_time: 8,809ms profile_time: 135ms gc_time: 255ms invalidation_time: 21ms vm_write_pc_count: 116,809,164 vm_write_sp_count: 116,809,164 vm_write_locals_count: 111,533,227 vm_write_stack_count: 111,533,227 vm_write_to_parent_iseq_local_count: 521,277 vm_read_from_parent_iseq_local_count: 12,757,231 guard_type_count: 126,653,751 guard_type_exit_ratio: 0.6% guard_shape_count: 44,193,824 guard_shape_exit_ratio: 2.4% code_region_bytes: 14,336,000 zjit_alloc_bytes: 19,282,889 total_mem_bytes: 33,618,889 side_exit_count: 3,484,374 total_insn_count: 697,672,179 vm_insn_count: 52,531,010 zjit_insn_count: 645,141,169 ratio_in_zjit: 92.5% ``` </p> </details> <details><summary>Optimized invokesuper stats</summary> <p> ``` Top-20 not inlined C methods (51.1% of total 15,736,852): Hash#key?: 1,260,867 ( 8.0%) Regexp#match?: 970,900 ( 6.2%) Hash#fetch: 898,248 ( 5.7%) Integer#===: 439,075 ( 2.8%) Hash#delete: 405,825 ( 2.6%) Array#any?: 403,600 ( 2.6%) String.new: 401,818 ( 2.6%) String#b: 319,473 ( 2.0%) String#to_sym: 272,868 ( 1.7%) Array#all?: 260,132 ( 1.7%) Fiber.current: 259,588 ( 1.6%) Array#join: 257,125 ( 1.6%) Array#include?: 247,718 ( 1.6%) Kernel#Array: 244,579 ( 1.6%) String#<<: 242,475 ( 1.5%) Symbol#end_with?: 239,977 ( 1.5%) String#force_encoding: 239,520 ( 1.5%) Kernel#dup: 232,706 
( 1.5%) Array#[]: 225,160 ( 1.4%) Kernel#respond_to?: 220,246 ( 1.4%) Top-20 calls to C functions from JIT code (73.2% of total 106,690,862): rb_vm_opt_send_without_block: 22,031,722 (20.7%) rb_hash_aref: 9,335,543 ( 8.8%) rb_vm_env_write: 7,865,751 ( 7.4%) rb_vm_send: 6,836,939 ( 6.4%) rb_zjit_writebarrier_check_immediate: 5,623,259 ( 5.3%) rb_vm_getinstancevariable: 5,012,844 ( 4.7%) rb_ivar_get_at_no_ractor_check: 4,868,219 ( 4.6%) rb_hash_aset: 2,061,385 ( 1.9%) rb_obj_is_kind_of: 1,812,575 ( 1.7%) rb_vm_invokeblock: 1,647,238 ( 1.5%) rb_vm_opt_getconstant_path: 1,295,958 ( 1.2%) Hash#key?: 1,260,867 ( 1.2%) rb_class_allocate_instance: 1,190,704 ( 1.1%) rb_hash_new_with_size: 1,150,765 ( 1.1%) rb_vm_setinstancevariable: 1,119,304 ( 1.0%) rb_ec_ary_new_from_values: 1,050,780 ( 1.0%) rb_obj_alloc: 993,446 ( 0.9%) rb_str_concat_literals: 984,559 ( 0.9%) Regexp#match?: 970,900 ( 0.9%) rb_obj_as_string_result: 937,751 ( 0.9%) Top-2 not optimized method types for send (100.0% of total 3,423,067): iseq: 3,410,096 (99.6%) optimized: 12,971 ( 0.4%) Top-2 not optimized method types for send_without_block (100.0% of total 319,311): optimized_send: 246,250 (77.1%) null: 73,061 (22.9%) Top-1 not optimized method types for super (100.0% of total 20,161): attrset: 20,161 (100.0%) Top-4 instructions with uncategorized fallback reason (100.0% of total 2,617,553): invokeblock: 1,647,238 (62.9%) sendforward: 748,101 (28.6%) invokesuperforward: 199,443 ( 7.6%) opt_send_without_block: 22,771 ( 0.9%) Top-20 send fallback reasons (100.0% of total 32,043,318): send_without_block_polymorphic: 12,818,949 (40.0%) send_without_block_no_profiles: 5,442,967 (17.0%) send_not_optimized_method_type: 3,423,067 (10.7%) uncategorized: 2,617,553 ( 8.2%) send_no_profiles: 2,083,824 ( 6.5%) one_or_more_complex_arg_pass: 1,663,150 ( 5.2%) send_polymorphic: 1,329,142 ( 4.1%) send_without_block_not_optimized_need_permission: 510,814 ( 1.6%) too_many_args_for_lir: 477,267 ( 1.5%) singleton_class_seen: 
441,058 ( 1.4%) super_complex_args_pass: 331,767 ( 1.0%) send_without_block_not_optimized_method_type_optimized: 246,250 ( 0.8%) send_without_block_megamorphic: 228,672 ( 0.7%) super_target_complex_args_pass: 165,855 ( 0.5%) send_without_block_not_optimized_method_type: 73,061 ( 0.2%) obj_to_string_not_string: 67,862 ( 0.2%) super_call_with_block: 40,004 ( 0.1%) send_without_block_direct_keyword_mismatch: 39,783 ( 0.1%) super_polymorphic: 22,088 ( 0.1%) super_not_optimized_method_type: 20,161 ( 0.1%) Top-3 setivar fallback reasons (100.0% of total 1,119,304): not_monomorphic: 1,077,792 (96.3%) not_t_object: 41,335 ( 3.7%) new_shape_needs_extension: 177 ( 0.0%) Top-1 getivar fallback reasons (100.0% of total 5,012,869): not_monomorphic: 5,012,869 (100.0%) Top-2 definedivar fallback reasons (100.0% of total 142,798): not_monomorphic: 142,711 (99.9%) not_t_object: 87 ( 0.1%) Top-6 invokeblock handler (100.0% of total 1,647,238): monomorphic_iseq: 878,253 (53.3%) polymorphic: 483,612 (29.4%) monomorphic_other: 134,943 ( 8.2%) monomorphic_ifunc: 115,175 ( 7.0%) megamorphic: 34,939 ( 2.1%) no_profiles: 316 ( 0.0%) Top-8 popular complex argument-parameter features not optimized (100.0% of total 2,068,582): param_forwardable: 729,353 (35.3%) param_block: 716,534 (34.6%) param_rest: 327,865 (15.8%) caller_splat: 114,365 ( 5.5%) caller_kw_splat: 99,266 ( 4.8%) param_kwrest: 80,149 ( 3.9%) caller_blockarg: 877 ( 0.0%) caller_kwarg: 173 ( 0.0%) Top-1 compile error reasons (100.0% of total 156,707): exception_handler: 156,707 (100.0%) Top-5 unhandled YARV insns (100.0% of total 201,517): getconstant: 160,920 (79.9%) expandarray: 19,985 ( 9.9%) setblockparam: 19,972 ( 9.9%) checkmatch: 480 ( 0.2%) once: 160 ( 0.1%) Top-2 unhandled HIR insns (100.0% of total 128,646): throw: 93,060 (72.3%) invokebuiltin: 35,586 (27.7%) Top-19 side exit reasons (100.0% of total 3,504,293): guard_shape_failure: 1,042,515 (29.7%) guard_type_failure: 812,249 (23.2%) 
block_param_proxy_not_iseq_or_ifunc: 795,628 (22.7%) unhandled_yarv_insn: 201,517 ( 5.8%) compile_error: 156,707 ( 4.5%) unhandled_hir_insn: 128,646 ( 3.7%) unhandled_newarray_send_pack: 119,187 ( 3.4%) patchpoint_method_redefined: 80,779 ( 2.3%) unhandled_block_arg: 60,517 ( 1.7%) block_param_proxy_modified: 49,695 ( 1.4%) guard_less_failure: 20,033 ( 0.6%) guard_super_method_entry: 19,855 ( 0.6%) fixnum_lshift_overflow: 9,985 ( 0.3%) patchpoint_stable_constant_names: 5,752 ( 0.2%) fixnum_mult_overflow: 569 ( 0.0%) obj_to_string_fallback: 498 ( 0.0%) patchpoint_no_ep_escape: 109 ( 0.0%) interrupt: 46 ( 0.0%) guard_greater_eq_failure: 6 ( 0.0%) send_count: 133,600,402 dynamic_send_count: 32,043,318 (24.0%) optimized_send_count: 101,557,084 (76.0%) dynamic_setivar_count: 1,119,304 ( 0.8%) dynamic_getivar_count: 5,012,869 ( 3.8%) dynamic_definedivar_count: 142,798 ( 0.1%) iseq_optimized_send_count: 38,025,870 (28.5%) inline_cfunc_optimized_send_count: 39,628,762 (29.7%) inline_iseq_optimized_send_count: 3,624,854 ( 2.7%) non_variadic_cfunc_optimized_send_count: 12,631,917 ( 9.5%) variadic_cfunc_optimized_send_count: 7,645,681 ( 5.7%) compiled_iseq_count: 2,870 failed_iseq_count: 0 compile_time: 8,419ms profile_time: 133ms gc_time: 248ms invalidation_time: 20ms vm_write_pc_count: 116,729,857 vm_write_sp_count: 116,729,857 vm_write_locals_count: 111,453,921 vm_write_stack_count: 111,453,921 vm_write_to_parent_iseq_local_count: 521,275 vm_read_from_parent_iseq_local_count: 12,757,225 guard_type_count: 126,594,209 guard_type_exit_ratio: 0.6% guard_shape_count: 44,193,683 guard_shape_exit_ratio: 2.4% code_region_bytes: 14,368,768 zjit_alloc_bytes: 19,581,578 total_mem_bytes: 33,950,346 side_exit_count: 3,504,293 total_insn_count: 697,692,070 vm_insn_count: 52,828,675 zjit_insn_count: 644,863,395 ratio_in_zjit: 92.4% ``` </p> </details>
…#16002) Fix NEWOBJ hook calling cruby functions on objects not filled yet. Objects like `TypedData` need to be zeroed out when calling `rb_obj_memsize_of`. Other object types need `fields_obj` to be 0 when they don't have one, etc. Fixes [Bug #21854]
Bumps [actions/cache](https://github.com/actions/cache) from 5.0.2 to 5.0.3. - [Release notes](https://github.com/actions/cache/releases) - [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md) - [Commits](actions/cache@8b402f5...cdf6c1f) --- updated-dependencies: - dependency-name: actions/cache dependency-version: 5.0.3 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com>
…by#8989 from nobu/test-tmpdir"" This reverts commit ruby/rubygems@6e00da098aba. ruby/rubygems@c6abdae812
When an object fails to be made shareable with `Ractor.make_shareable` or when an unshareable object is accessed through module constants or module instance variables, the error message now includes the chain of references that leads to the unshareable value.
Improve the messages of exceptions raised by the Ractor implementation.
When an object fails to be made shareable with `Ractor.make_shareable`, or when an unshareable object is accessed through module constants or module instance variables, the error message now includes the chain of references that leads to the unshareable value.
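To see where the chain helps, here is a minimal sketch of the `Ractor.make_shareable` failure case. The object graph (`config`, `:workers`, `:lock`) and the helper `shareable_error_message` are hypothetical names made up for illustration, and it assumes a `Thread::Mutex` cannot be made shareable on the Ruby build in use. The exact wording of the message, including any chain lines, depends on whether the build includes this change, so the sketch only prints whatever it receives.

```ruby
# Hypothetical object graph: the unshareable Mutex is buried two levels
# down, inside a Hash that sits in an Array that sits in another Hash.
config = { workers: [{ lock: Thread::Mutex.new }] }

# Hypothetical helper: returns the error message if `obj` cannot be made
# shareable, or nil if Ractor.make_shareable succeeds.
def shareable_error_message(obj)
  Ractor.make_shareable(obj)
  nil
rescue Ractor::Error => e
  e.message
end

message = shareable_error_message(config)

# Without the chain, the message names only the offending object. With it,
# the message should also say how that object is reached from `config`.
puts message
```

With only the innermost object named, the reader has to search the whole graph for the Mutex. The chain of references is meant to remove that search.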