Symbol GC 
#rubykaigi 2014 
Narihiro Nakamura - @nari3
Self introduction
✔ Nari, @nari3, authorNari 
✔ A CRuby committer. 
✔ I work at NaCl. 
✔ “Nakamura” 
✔ is the most powerful clan in Ruby World.
Author 
http://tatsu-zine.com/books/gcbook
An unmotivated rubyist.
Today's topic 
obj = Object.new
100_000.times do |i|
  obj.respond_to?("sym#{i}".to_sym)
end
GC.start
puts "symbol : #{Symbol.all_symbols.size}"

$ ruby-2.1.2 a.rb
symbol : 102416
$ ruby-trunk
symbol : 2833
What is Symbol?
Symbol 
✔ A symbol is a primitive data type whose 
instances have a unique human-readable 
form. 
✔ Symbols can be used as identifiers.
:symbol
A pitfall of Symbol
✔ Symbols are never garbage collected.
✔ Many beginners don't know this.
✔ Even experienced Rubyists make this mistake.
✔ Prone to vulnerabilities
✔ User input → symbol
✔ Puts pressure on memory
Simple cases
✖ if user.respond_to?(params[:method].to_sym)
Checks whether the method is callable.
NG: params[:method] is user input.
✖ params[params[:attr].to_sym]
Gets a hash value via a symbol key.
NG: params[:attr] is user input.
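One way to avoid both cases is to compare the user-supplied string against a fixed allow-list before any symbol is created. The following is a minimal sketch; the ALLOWED_ATTRS list and the fetch_attr helper are hypothetical names used for illustration and do not come from the talk.

```ruby
# Hypothetical allow-list: the only attribute names the application accepts.
ALLOWED_ATTRS = %w[name email].freeze

# Returns the value for a permitted attribute, or nil otherwise.
# The string is checked first, so unknown input never reaches to_sym.
def fetch_attr(params, attr_name)
  return nil unless ALLOWED_ATTRS.include?(attr_name)
  params[attr_name.to_sym]
end

params = { name: "nari", email: "nari@example.com" }
permitted = fetch_attr(params, "name")       # key is on the allow-list
rejected  = fetch_attr(params, "sym123456")  # rejected, no symbol is created
```

Because the comparison happens on strings, the number of symbols that user input can produce is bounded by the size of the allow-list.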
Rails DoS Vulnerability
CVE-2012-3424
HTTP Request: GET ….
WWW-Authenticate: Digest realm="..", nonce="...", algorithm=MD5, qop="auth", foo="xxx", ..
Parse to a hash (every key goes through to_sym):
digest = {
:realm => "..",
:nonce => "..",
..,
:foo => "..",
. . .,
}
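The problem in this kind of parser is that every key in the header, including attacker-chosen ones such as foo, is converted with to_sym. The sketch below shows one way to symbolize only known keys. It is an illustration, not the actual Rails patch; KNOWN_DIGEST_KEYS and parse_digest_params are hypothetical names.

```ruby
# Hypothetical list of the Digest parameters the application understands.
KNOWN_DIGEST_KEYS = %w[realm nonce algorithm qop].freeze

# Parses key="value" or key=value pairs and keeps only known keys.
# Unknown keys are dropped while they are still strings.
def parse_digest_params(header_value)
  pairs = header_value.scan(/(\w+)=(?:"([^"]*)"|([^,\s]+))/)
  pairs.each_with_object({}) do |(key, quoted, bare), params|
    next unless KNOWN_DIGEST_KEYS.include?(key)
    params[key.to_sym] = quoted || bare
  end
end

digest = parse_digest_params('realm="r", nonce="n", algorithm=MD5, qop="auth", foo="xxx"')
```

Here digest contains :realm, :nonce, :algorithm and :qop, while foo never becomes a symbol.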
We want Symbol GC 
✔ This has been requested for a long time.
✔ Sasada-san has an idea. 
✔ I will implement this idea.
Are symbols in other 
programming 
languages garbage 
collectable?
Programming languages
that support symbols
✔ Too Many Parentheses Languages 
✔ Erlang 
✔ Smalltalk 
✔ Scala
Symbol GC support
Language                  Symbol GC
Erlang                    ✖
Gauche                    ✖
Clojure                   ○
EmacsLisp                 ✖
VisualWorks (Smalltalk)   ○
Scala                     ○
Implementation-dependent?
✔ Not unified.
✔ Symbol GC is undocumented in
programming language specifications.
✔ Implementation = Specification?
EmacsLisp 
✔ Function: unintern 
✔ (unintern 'foo) 
✔ Declares that a symbol is no longer needed.
✔ It's like manual memory management.
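Scala
main.scala
01: var a = 'sym
[Diagram: on the Java side, the Symbol Table refers to the String “sym”.]
Scala
main.scala
01: var a = 'sym
02: a = null
[Diagram: the Symbol Table holds only a Weak Reference to the String “sym”, so the entry can be collected at GC START.]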
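The weak-reference idea can be sketched in Ruby with the weakref standard library. This is a toy model of an intern table whose entries do not keep their symbols alive. ToySymbol and WeakSymbolTable are hypothetical names, and this is not how Scala or CRuby actually represent symbols.

```ruby
require "weakref"

# Toy stand-in for a symbol object.
ToySymbol = Struct.new(:name)

class WeakSymbolTable
  def initialize
    @table = {} # name (String) => WeakRef pointing at a ToySymbol
  end

  # Returns the live ToySymbol for name, creating a new one if the
  # previous entry is missing or has already been garbage collected.
  def intern(name)
    ref = @table[name]
    if ref
      begin
        return ref.__getobj__
      rescue WeakRef::RefError
        # The old symbol was collected; fall through and create a new one.
      end
    end
    sym = ToySymbol.new(name.dup.freeze)
    @table[name] = WeakRef.new(sym)
    sym
  end
end

table = WeakSymbolTable.new
a = table.intern("sym")
b = table.intern("sym")
same_object = a.equal?(b) # holds while a strong reference such as `a` exists
```

Once no strong reference to the ToySymbol remains, the garbage collector is free to reclaim it, and the next intern call simply creates a fresh entry.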
Details of CRuby's 
Symbol
“sym”.to_sym
[Diagram, C side and Ruby side: the String “sym” is frozen into a Frozen String. On the C side, global_symbols holds sym_id (hash) and last_id (long); last_id advances from 1000 to 1001.]
“sym”.to_sym
[Diagram, continued: sym_id maps the Frozen String “sym” to ID 1001, and ID2SYM(ID) converts ID 1001 into the SYMBOL (VALUE) :sym on the Ruby side.]
ID
✔ ID: Used at the C level.
✔ IDs are stored in a method table or a variable table.
✔ A unique number that corresponds to a symbol.
✔ Created by rb_intern(“foo”) in the C API.
✔ :sym == :sym → 1001 == 1001
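The ID itself is not visible from Ruby code, but its consequence is: two symbols with the same name are the same object, so comparing them is an identity check rather than a character-by-character comparison. A small sketch of that observable behavior, with illustrative variable names:

```ruby
# Symbols with the same name are one and the same object.
literal   = :sym
converted = "sym".to_sym
same_symbol = literal.equal?(converted)

# Strings with the same content are separate objects.
same_string = String.new("sym").equal?(String.new("sym"))

# Equality still holds for the strings, but only by comparing contents.
equal_content = String.new("sym") == String.new("sym")
```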
SYMBOL(VALUE)
✔ SYMBOL(VALUE): Used at the Ruby level.
✔ The raw data behind :sym or ”sym”.to_sym
✔ Uncollectable.
Why can't garbage
symbols be collected?
For example, a C extension stores an ID in
its static area
[Diagram: on the C side, sym_id maps “foo” to 1001 and last_id is 1001; on the Ruby side there is the SYMBOL (VALUE) :foo. Ruby's C extension declares "static public ID id;" and stores SYM2ID(:foo), which is 1001.]
If :foo is collected,
the ID in sym_id will be deleted.
[Diagram: GC START. The “foo” → 1001 entry is removed from sym_id, while the C extension's static id still holds 1001.]
Then “foo”.to_sym is called.
:foo == :foo but different ID
[Diagram: sym_id now maps “foo” to 1002, while the C extension's static id is still 1001. The two are different: SYM2ID(:foo) != id.]
Why garbage symbols
can't be collected
✔ Problem: IDs remain on the C side.
✔ We can't detect and manage all the IDs held by C extensions.
✔ The same symbol would get a different ID
✔ This would create inconsistent IDs.
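The inconsistency can be reproduced with a toy model in plain Ruby. ToySymbolTable is a hypothetical stand-in for global_symbols (a name-to-ID hash plus a last_id counter), and cached_id plays the role of the static ID kept by a C extension. It is a simulation of the scenario, not CRuby's implementation.

```ruby
# Toy model of sym_id (hash) and last_id (long).
class ToySymbolTable
  def initialize
    @sym_id  = {}
    @last_id = 1000
  end

  # Returns the existing ID for name, or numbers a new one.
  def intern(name)
    @sym_id[name] ||= (@last_id += 1)
  end

  # Simulates a naive Symbol GC that simply drops the table entry.
  def collect(name)
    @sym_id.delete(name)
  end
end

table = ToySymbolTable.new
cached_id = table.intern("foo") # the "C extension" remembers this ID
table.collect("foo")            # :foo is collected, the entry disappears
new_id = table.intern("foo")    # "foo".to_sym is called again
inconsistent = cached_id != new_id
```

The cached value still refers to the old number, so any table keyed by the old ID no longer matches the symbol that Ruby code sees.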
In Ruby world
RIP. A symbol is dead...
Photo by MIKI Yoshihito, https://www.flickr.com/photos/mujitra/7571022490
In C world
WRRRYYYYY!!!
I'm still alive....!
ID
Photo by Zufallsfaktor, https://www.flickr.com/photos/zufallsfaktor/5911338959
How do we implement
Symbol GC?
Idea
Separate symbols into two types
Immortal Symbol (C World)
Mortal Symbol (Ruby World)
Immortal Symbol
✔ These symbols have a corresponding ID
✔ e.g. method names, variable names, constant names, etc.
✔ Used mainly at the C level
✔ Uncollectable
✔ A symbol stays alive once it has been assigned an ID
✔ There is no transition to Mortal Symbol.
def foo; end
[Diagram: the Frozen String “foo” is registered in sym_id with ID 1001; last_id is 1001.]
Store the ID in the method table
[Diagram: the Method table maps ID 1001 to def foo; end.]
ID2SYM(ID) → VALUE
[Diagram: ID2SYM(ID) turns ID 1001 into the Immortal Symbol (VALUE) :foo on the Ruby side.]
Mortal Symbol
✔ These symbols don't have an ID
✔ “sym”.to_sym → Mortal Symbol
✔ Used mainly at the Ruby level
✔ Collectable
✔ Unreachable symbols are collected.
✔ A Mortal Symbol can transition to an Immortal Symbol.
“bar”.to_sym
[Diagram: next to the Immortal Symbol :foo (ID 1001), a Mortal Symbol (VALUE) :bar is created with its Frozen String “bar”; global_symbols records “bar” → :bar without numbering an ID.]
Split into uncollectable and
collectable objects
[Diagram: the Immortal Symbol :foo with ID 1001 is Uncollectable; the Mortal Symbol :bar and its Frozen String “bar” are Collectable.]
:bar will be collected
[Diagram: the Mortal Symbol :bar, its Frozen String “bar” and the “bar” → :bar entry are collected; :foo remains.]
If an Immortal Symbol of
the same name
already exists
def foo; end
[Diagram: sym_id maps “foo” to ID 1001, and ID2SYM(ID) gives the Immortal Symbol (VALUE) :foo.]
“foo”.to_sym
[Diagram: before creating a Mortal Symbol :foo, the table is checked. The existing Immortal Symbol :foo is found and used instead.]
From Mortal Symbol 
to Immortal Symbol
define_method(“foo”.to_sym){}
[Diagram: “foo”.to_sym creates a Mortal Symbol (VALUE) :foo; global_symbols records “foo” → :foo.]
define_method(“foo”.to_sym){}
[Diagram: SYM2ID(VALUE) returns 0x2c8d0, the address of the symbol (Address = ID). The symbol is pinned down and becomes Uncollectable, so the Mortal Symbol turns into an Immortal Symbol. The Method table maps 0x2c8d0 to def foo; end.]
CAUTION
A new pitfall is 
coming!
Immortal Symbol
✔ Not all symbols are garbage collected.
✔ Immortal symbols are not garbage
collected.
✔ Mortal → Immortal symbol when
an ID is assigned.
✔ This can still lead to vulnerabilities!
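The transition can be observed from Ruby code. In the sketch below, method names built at run time are passed to define_method, which needs an ID for each of them. The symbols should therefore still be listed in Symbol.all_symbols after a GC run. The class and variable names are illustrative only.

```ruby
klass = Class.new
names = Array.new(1_000) { |i| "pinned_method_#{i}" }

# define_method stores each name in the method table, which requires an ID,
# so every one of these dynamically created symbols becomes immortal.
names.each { |name| klass.send(:define_method, name.to_sym) {} }

GC.start
live_names = Symbol.all_symbols.map(&:to_s)

# All method names are still present in the symbol table after GC.
all_pinned = (names - live_names).empty?
```

If the names came from user input, an attacker could grow the symbol table in exactly this way even on a Ruby with Symbol GC.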
A new pitfall
✔ The number of Immortal Symbols can increase
unintentionally.
✔ For instance: getting a name from a symbol
✔ rb_id2str(SYM2ID(sym))
✔ Mortal → Immortal
✔ Please use rb_sym2str() instead.
✔ Watch out for careless uses of SYM2ID().
Please keep monitoring
✔ Check Symbol.all_symbols.size
✔ Please report a bug to ruby-core or the library author if
the number of symbols keeps increasing.
✔ It's a transition period now.
✔ It will get better gradually.
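A small helper makes this check repeatable, for example in a test suite. This is a minimal sketch. symbol_growth is a hypothetical helper name, and the number it returns depends on the Ruby version and on what the GC happens to collect.

```ruby
# Runs the block and returns how much Symbol.all_symbols.size changed,
# forcing a GC before and after so that collectable symbols are not counted.
def symbol_growth
  GC.start
  before = Symbol.all_symbols.size
  yield
  GC.start
  Symbol.all_symbols.size - before
end

# Example: symbols created only through to_sym and then dropped.
growth = symbol_growth do
  1_000.times { |i| "tmp_sym_#{i}".to_sym }
end
puts "symbol growth: #{growth}"
```

A value that keeps rising from run to run for the same code path suggests that something is turning Mortal Symbols into Immortal ones.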
Details of 
implementation 
(for CRuby Hackers)
Static Symbol, 
Dynamic Symbol 
✔ Static Symbol = Immediate value 
✔ Immortal 
✔ Dynamic Symbol = RVALUE 
✔ Mortal or Immortal 
✔ Becomes an Immortal Symbol when an ID is needed.
✔ Similar to Float and FLONUM
Details of RSymbol struct
struct RSymbol {
    struct RBasic basic;
    VALUE fstr;    /* Frozen String, e.g. “foo” */
    ID type;       /* ID type, one of the values below */
};
ID_LOCAL     0b00000
ID_INSTANCE  0b00010
ID_GLOBAL    0b00110
ID_ATTRSET   0b01000
・・・
ID Structure
0bxxx.....xxx 000
High-order 61 bits = Counter, Low-order 3 bits = ID type
0bxxx.....xx 000 1
Low-order 1 bit = Static Symbol Flag
Fast ID recognition
✔ Low-order 1 bit = 1 → Static Symbol
✔ Dynamic Symbol ID = RVALUE address
✔ Low-order 1 bit = 0
✔ Only the lowest bit needs to be checked.
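The bit test can be modeled with plain integers. The sketch below assumes a simplified layout read off the slides: one flag bit at the bottom, then three type bits, then the counter. The constant and method names are illustrative, and the real CRuby source may differ in its details.

```ruby
# Assumed layout, lowest bit first: 1 flag bit, 3 type bits, then the counter.
STATIC_SYMBOL_FLAG = 0b1
ID_TYPE_MASK       = 0b1110
ID_TYPE_BITS       = 4 # flag bit plus three type bits

# Type values as listed on the RSymbol slide (already shifted past the flag bit).
ID_LOCAL    = 0b00000
ID_INSTANCE = 0b00010
ID_GLOBAL   = 0b00110
ID_ATTRSET  = 0b01000

# A static symbol ID has its lowest bit set.
def static_symbol_id?(id)
  (id & STATIC_SYMBOL_FLAG) == 1
end

# Packs a counter and a type into a static symbol ID.
def build_static_id(counter, type)
  (counter << ID_TYPE_BITS) | type | STATIC_SYMBOL_FLAG
end

def id_type(id)
  id & ID_TYPE_MASK
end

def id_counter(id)
  id >> ID_TYPE_BITS
end

static_id  = build_static_id(1001, ID_INSTANCE)
dynamic_id = 0x2c8d0 # an RVALUE address, as in the define_method slide
```

An object address is aligned, so its lowest bit is always 0, which is why a single bit is enough to tell the two kinds of ID apart.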
Conclusion
✔ Most symbols will be garbage collected. 
✔ But some symbols won't be garbage 
collected. 
✔ “sym”.to_sym → OK 
✔ define_method(“sym”.to_sym){} → NG
Acknowledgments
✔ Sasada-san
✔ Taught me the idea of Symbol GC.
✔ Refined the Symbol GC code.
✔ Nakada-san, Tsujimoto-san, U.Nakamura-san,
etc.
✔ Fixed many bugs.
✔ NaCl members
Thank you!
