Operational Efficiency Hacks
                          John Allspaw
          Operations Engineering, Flickr
who am I?
Manage the Flickr Operations group
Wrote a geeky book:
“Efficiencies”
   Doing more with the robots you’ve got
   Doing more with the humans you’ve got
Some optimization “rules”

-   Don’t rely on being able to tweak anything.
-   Don’t waste too much time tuning when
    you have no evidence it’ll matter.
Optimization “rules”
[Graph: performance tuning gains (y-axis) vs. time spent tuning (x-axis)]
Optimization “rules”

“We should forget about small efficiencies,
say about 97% of the time: premature
optimization is the root of all evil.”
                         Knuth (or Hoare)
however...
Optimization “rules”

That doesn’t give us an excuse to be lazy and
inefficient.

We lean on the experience of people in the
community for evidence that tuning(s) might
be a worthwhile thing to do.
Optimization “rules”

“Yet we should not pass up our
opportunities in that critical 3 percent.”
                       Knuth (or Hoare)
So...
[Graph: tuning gains curve annotated with three regions: "obvious tuning wins", "stop somewhere in here", and "OMG I'm wasting !@#$ time for no reason"]
Our Context

-   24 TB of MySQL data
-   32k/sec of MySQL writes
-   120k/sec of MySQL reads
-   6 PB of photos
-   10 TB of storage eaten per day
-   15,362 service monitors (alerts)
Infrastructure Hacks

-   Examples of what changing software can do
          (plain old-fashioned performance tuning)
-   Examples of what changing hardware can do
                              (yay for Mr. Moore!)
Leaning on compilers
(synthetic PHP benchmarks, not real-world)




(http://sebastian-bergmann.de/archives/634-PHP-GCC-ICC-Benchmark.html)
PHP (real-world)

PHP 4.4.8 to PHP 5.2.8 migration
Can now handle more with less

same taste, less filling
Image Processing

-   In 2004, Flickr was using ImageMagick
    (version 6.1.9) for image processing
-   Changed to GraphicsMagick (version 1.1.5),
    about 15% faster at the time
-   We only need a subset of ImageMagick's
    features for our purposes anyway
Image Processing

-   OpenMP support
    (http://en.wikipedia.org/wiki/Openmp)
    - Allows parallelization of processing jobs,
       with multiple cores working on the same
       image
    - Some algorithms parallelize better
       than others
Image Processing
-   Test script
     -   7 large-ish DSLR photos
     -   Cascade resizing each to 6 smaller sizes,
         semi-typical for Flickr’s workload
     -   Each resize processed serially
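
A minimal sketch of what such a test script might look like, assuming "cascade" means each smaller size is generated from the previous, larger output rather than from the original. The six sizes and the file naming are hypothetical (the slides do not list them); `gm convert ... -resize` is GraphicsMagick's command-line resize.

```python
import subprocess
from pathlib import Path

# Hypothetical longest-side sizes in pixels; the slides do not list the real ones.
SIZES = [1024, 500, 240, 100, 75, 50]


def cascade_commands(source, sizes=SIZES):
    """Build one `gm convert` command per size, largest first.

    Each command reads the output of the previous one, so every resize
    after the first starts from an already-reduced image.
    """
    src = Path(source)
    current = src
    commands = []
    for size in sorted(sizes, reverse=True):
        target = src.with_name(f"{src.stem}_{size}{src.suffix}")
        commands.append(
            ["gm", "convert", str(current), "-resize", f"{size}x{size}", str(target)]
        )
        current = target
    return commands


def run_serially(commands):
    """Run each resize one after another (needs GraphicsMagick installed)."""
    for command in commands:
        subprocess.run(command, check=True)
```

Timing `run_serially(cascade_commands(photo))` over the same set of photos on each build or machine would give comparable numbers across compilers, OpenMP settings, and CPUs.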
Image Processing
   compiler differences




(GM version 1.1.14, non-OpenMP)
Image Processing
          OpenMP differences
[Chart: resize timings with and without OpenMP, annotated "OpenMP advantage"]

(gcc 4.1.2, on quad core Xeon L5335 @ 2.00GHz)
Image Processing
   CPU differences
Diagonal Scaling
-   Vertically scaling your already horizontally-
    scaled nodes
-   a.k.a. “tech refresh”
-   a.k.a. “Moore’s Law Surfing”
Diagonal Scaling
We replaced 67 “old” webservers with 18 “new”:

servers   CPUs         RAM          drives       total power (W)
          per server   per server   per server   @60% peak
   67         2        4GB          1x80GB       8763.6
   18         8        4GB          1x146GB      2332.8

~70% LESS power
49U LESS rack space
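
The savings quoted for the webserver swap can be checked from the table's own numbers. This is a sketch of that arithmetic; the only assumption not on the slide is one rack unit per server, which is consistent with 67 - 18 = 49U.

```python
def percent_less(old, new):
    """Return how much smaller `new` is than `old`, as a percentage."""
    return (old - new) / old * 100


old_servers, new_servers = 67, 18
old_power_w, new_power_w = 8763.6, 2332.8  # total watts at 60% of peak

power_saving = percent_less(old_power_w, new_power_w)
rack_units_freed = old_servers - new_servers  # assumes 1U per server
watts_per_old_server = old_power_w / old_servers
watts_per_new_server = new_power_w / new_servers
total_old_cpus = old_servers * 2
total_new_cpus = new_servers * 8

print(f"power saving:     {power_saving:.1f}%")
print(f"rack space freed: {rack_units_freed}U")
print(f"watts per server: {watts_per_old_server:.1f} old, {watts_per_new_server:.1f} new")
print(f"total CPUs:       {total_old_cpus} old, {total_new_cpus} new")
```

Per-server power draw comes out nearly identical, so the saving comes almost entirely from needing fewer boxes for slightly more total CPUs.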
Diagonal Scaling
We replaced 23 “old” image processing boxes with 8 “new”:

servers   photos/min   rack (U)   total power (W)
                                  @60% peak
   23        1035         23       3008.4
    8        1120          8       1036.8

~75% FASTER
15U LESS rack space
65% LESS power

[Photos: "from this" (old boxes), "to this" (new boxes)]
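
For the image processing swap, the table also supports a per-box and per-watt comparison. A small sketch using only the figures above (the dictionary layout and function names are illustrative):

```python
old = {"boxes": 23, "photos_per_min": 1035, "power_w": 3008.4}
new = {"boxes": 8, "photos_per_min": 1120, "power_w": 1036.8}


def per_box(fleet):
    """Photos per minute handled by a single box."""
    return fleet["photos_per_min"] / fleet["boxes"]


def per_watt(fleet):
    """Photos per minute delivered per watt drawn."""
    return fleet["photos_per_min"] / fleet["power_w"]


power_saving = (old["power_w"] - new["power_w"]) / old["power_w"] * 100
rack_units_freed = old["boxes"] - new["boxes"]

print(f"photos/min per box:  {per_box(old):.0f} old, {per_box(new):.0f} new")
print(f"photos/min per watt: {per_watt(old):.2f} old, {per_watt(new):.2f} new")
print(f"power saving:        {power_saving:.1f}%")
print(f"rack space freed:    {rack_units_freed}U")
```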
What do you do with
    old/slow machines?

-   Liquidate
-   Re-purpose as dev/staging/etc
-   “offline” tasks
Offline Tasks
-   Out-of-band/asynchronous queuing and execution
    system, for non-realtime tasks
-   See here:
    http://code.flickr.com/blog/2008/09/26/flickr-engineers-do-it-offline/

-   See Myles Grant talk about it more here:
    http://en.oreilly.com/velocity2009/public/schedule/detail/7552
Runbook Hacks

“WTF HAPPENED LAST NIGHT?!”
Why?
As infrastructure grows, try to keep the
Humans:Machines ratio from getting out of
hand
Some of the How:
  -   teach machines to build themselves
  -   teach machines to watch themselves
  -   teach machines to fix themselves
  -   reduce MTTR by streamlining
Automated Infrastructure
-   If there is only one thing you do, automatic
    configuration and deployment management
    should be it.
-   See:
     -     Opscode/Chef (http://opscode.com/)
     -     Puppet (http://reductivelabs.com/products/puppet/)
     -     System Imager/Configurator
           (http://wiki.systemimager.org)
Configuration Management
     Codeswarm
Time
Machine time is cheaper than human time.

If a failure results in some commands being
run to ‘fix’ it, make the machines do it.

(i.e., don’t wake people up for stupid things!)
Aggregate Monitoring

Don’t care about single nodes, only care about delta
change of metrics/faults
   -   Warn (email) on X % change
   -   Page (wake up) on Y % change
High and low water marks for some metrics
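
One way the warn/page rule above could be expressed is sketched below. The function names, the choice to sum per-node samples, and the decision to page on a water-mark breach are all assumptions for illustration; the slides only state the X%/Y% thresholds and that water marks exist.

```python
def aggregate(samples):
    """Sum a per-node metric across the cluster (node name -> value)."""
    return sum(samples.values())


def percent_change(previous, current):
    """Percentage change of an aggregate metric between two samples."""
    if previous == 0:
        raise ValueError("previous sample must be non-zero")
    return (current - previous) / previous * 100


def classify(previous, current, warn_pct, page_pct, low=None, high=None):
    """Return "page", "warn", or "ok" for a cluster-wide metric."""
    # Water marks are absolute limits; breaching one pages (an assumption).
    if low is not None and current < low:
        return "page"
    if high is not None and current > high:
        return "page"
    change = abs(percent_change(previous, current))
    if change >= page_pct:
        return "page"  # Y% change: wake someone up
    if change >= warn_pct:
        return "warn"  # X% change: send an email
    return "ok"
```

For example, `classify(aggregate(before), aggregate(after), warn_pct=10, page_pct=25)` ignores which node moved and reacts only to the cluster-wide delta.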
Self-Healing
Make service monitoring fix common failure
scenarios and notify us about it later.

Daemons/processes running on the machines take
corrective action under certain conditions and
report back with what they did.

This can greatly reduce your mean time to recovery
(MTTR)
Basic Apache Example

1. Webserver not running?
2. Under certain conditions, try to start it, and
   email that this happened. (I’ll read it
   tomorrow)
3. Won’t start? Assume something’s really
   wrong, so don’t keep trying (email that, too)
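
The three steps above can be sketched as a single function. This is not Flickr's actual tooling: the four callables are hypothetical hooks the caller supplies (a health check, a guard for the "certain conditions", a start command, and an email sender), which also keeps the logic testable without a real webserver.

```python
def heal_webserver(is_running, may_restart, start, notify):
    """Try one restart of a dead webserver and email what happened."""
    # Step 1: is the webserver running?
    if is_running():
        return "running"
    # Step 2: only act under the agreed conditions.
    if not may_restart():
        notify("webserver down; conditions not met, left alone")
        return "skipped"
    start()
    if is_running():
        notify("webserver was down; restarted it")
        return "restarted"
    # Step 3: it will not start, so assume something is really wrong
    # and do not keep trying.
    notify("webserver down and restart failed; not retrying")
    return "gave_up"
```

Making exactly one attempt is the design choice the slide calls out: a restart loop on a box with a real problem only hides the problem and floods the inbox.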
MySQL Self-Healing
   Some MySQL Issues “fixed” by the machines

- Kill long-running SELECT queries (marked safe to kill)
- Queries not safe to kill are marked by the application
   as “NO KILL” in comments
- Run EXPLAIN on killed queries, and report the results
- Keep track of the query types and databases that need
   the most killing, produce a “DBs that Suck” report
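
A rough sketch of the selection logic described above, assuming the input is a list of dicts shaped like rows of MySQL's `SHOW FULL PROCESSLIST` (columns `Id`, `db`, `Command`, `Time`, `Info`). The function names and the time threshold are illustrative, and the sketch only builds the `KILL` and `EXPLAIN` statements rather than executing them.

```python
from collections import Counter


def queries_to_kill(processlist, max_seconds):
    """Pick long-running SELECTs that the application has not marked NO KILL."""
    victims = []
    for row in processlist:
        sql = (row.get("Info") or "").strip()
        if row.get("Command") != "Query":
            continue  # sleeping or non-query threads are left alone
        if not sql.upper().startswith("SELECT"):
            continue  # never kill writes
        if "NO KILL" in sql:
            continue  # application marked this query as unsafe to kill
        if row.get("Time", 0) > max_seconds:
            victims.append(row)
    return victims


def kill_statements(victims):
    """One KILL statement per selected thread id."""
    return [f"KILL {row['Id']}" for row in victims]


def explain_statements(victims):
    """EXPLAIN each killed query so the plan can go into the report."""
    return [f"EXPLAIN {row['Info'].strip()}" for row in victims]


def dbs_that_suck(victims):
    """Count kills per database, worst offender first."""
    return Counter(row["db"] for row in victims).most_common()
```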
MySQL Self-Healing
  Some MySQL Replication issues “fixed” by the
  machines, by error code
- Skip errors that can safely be skipped and
  restart slave threads
- Force refetch of replication binlogs on:
       - 1064 (ER_PARSE_ERROR)
- Re-run queries on:
       - 1205 (ER_LOCK_WAIT_TIMEOUT)
       - 1213 (ER_LOCK_DEADLOCK)
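
The per-error handling above amounts to a lookup from replication error number to corrective action. A minimal sketch follows; the action names, the `skippable` parameter, and the fall-through to a human are assumptions, since the slides do not say which errors Flickr treats as safe to skip or what happens to unlisted errors.

```python
REFETCH_BINLOGS = {1064}      # ER_PARSE_ERROR
RERUN_QUERY = {1205, 1213}    # ER_LOCK_WAIT_TIMEOUT, ER_LOCK_DEADLOCK


def replication_action(errno, skippable=frozenset()):
    """Map a replication error number to the name of a corrective action.

    `skippable` is the site-specific set of error numbers considered safe
    to skip; it is empty by default because the slides do not list them.
    """
    if errno in skippable:
        return "skip_and_restart_slave"
    if errno in REFETCH_BINLOGS:
        return "refetch_binlogs"
    if errno in RERUN_QUERY:
        return "rerun_query"
    # Anything unrecognized is left for a person to look at (assumption).
    return "notify_human"
```

Keeping the mapping in plain sets makes it easy to review which errors the machines are allowed to handle on their own.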
Troubleshooting
Code and Config
 Deploy Logs

1. ESSENTIAL
2. MANDATORY
Communications
•   Internal IRC

      -   For ongoing discussions
      -   Logged, so “infinite” scrollback
•   IM Bot (built on libyahoo2.sf.net)

      -   For production changes
      -   Broadcasts everything to all contacts
      -   Logged, and injected into IRC
      -   IM Status = who is primary/secondary on-call

•   All of IRC and IM Bot slurped into a search index
[Screenshot: deploy log with columns labeled when, what, detailed what*, and who]
time of last deploy at top of ganglia

*also points to what commands should be used to back out the changes
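
The fields called out on the deploy log could be modeled as a small record. This is only an illustrative sketch: the class, its field names, and the one-line format are not taken from Flickr's tooling.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class DeployLogEntry:
    when: datetime   # timestamp, for correlating with graphs and alerts
    who: str         # person who pushed the change
    what: str        # short summary
    detail: str      # the "detailed what", e.g. files or configs changed
    backout: str     # pointer to the commands that undo the change

    def line(self):
        """One-line summary suitable for a log listing or a chat broadcast."""
        return f"{self.when:%Y-%m-%d %H:%M:%S} {self.who}: {self.what}"
```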
IM Bot (timestamps help correlation)




all IRC, IM bot into searchable history
Morals of Our Stories

-   Optimizations can be a Very Good Thing™
-   Weigh time spent optimizing against
    expected gains
-   Lean on others to learn what “expected
    gains” means in different scenarios
-   Plain old-fashioned intuition
Some Wisdom Nuggets

Jon Prall’s 85 WebOps Rules:
http://jprall.vox.com/library/post/85-operations-rules-to-live-by.html
Questions?

http://www.flickr.com/photos/ebarney/3348965637/
http://www.flickr.com/photos/dgmiller/1606071911/
http://www.flickr.com/photos/dannyboyster/60371673/
http://www.flickr.com/photos/bright/189338394/
http://www.flickr.com/photos/nickwheeleroz/2475011402/
http://www.flickr.com/photos/dramaqueennorma/191063346/
http://www.flickr.com/photos/telstar/2861103147/
http://www.flickr.com/photos/norby/2309046043/
http://www.flickr.com/photos/allysonk/201008992/

Enterprise Search Summit - Speeding Up SearchEnterprise Search Summit - Speeding Up Search
Enterprise Search Summit - Speeding Up Search
Ā 
MySQL Group Replication - Ready For Production? (2018-04)
MySQL Group Replication - Ready For Production? (2018-04)MySQL Group Replication - Ready For Production? (2018-04)
MySQL Group Replication - Ready For Production? (2018-04)
Ā 
Capacity Management from Flickr
Capacity Management from FlickrCapacity Management from Flickr
Capacity Management from Flickr
Ā 
Capacity Planning For Web Operations Presentation
Capacity Planning For Web Operations PresentationCapacity Planning For Web Operations Presentation
Capacity Planning For Web Operations Presentation
Ā 
Capacity Planning For Web Operations Presentation
Capacity Planning For Web Operations PresentationCapacity Planning For Web Operations Presentation
Capacity Planning For Web Operations Presentation
Ā 
Apache Gearpump - Lightweight Real-time Streaming Engine
Apache Gearpump - Lightweight Real-time Streaming EngineApache Gearpump - Lightweight Real-time Streaming Engine
Apache Gearpump - Lightweight Real-time Streaming Engine
Ā 
Scylla Summit 2018: OLAP or OLTP? Why Not Both?
Scylla Summit 2018: OLAP or OLTP? Why Not Both?Scylla Summit 2018: OLAP or OLTP? Why Not Both?
Scylla Summit 2018: OLAP or OLTP? Why Not Both?
Ā 
2016-JAN-28 -- High Performance Production Databases on Ceph
2016-JAN-28 -- High Performance Production Databases on Ceph2016-JAN-28 -- High Performance Production Databases on Ceph
2016-JAN-28 -- High Performance Production Databases on Ceph
Ā 
Cloudcon East Presentation
Cloudcon East PresentationCloudcon East Presentation
Cloudcon East Presentation
Ā 
Cloudcon East Presentation
Cloudcon East PresentationCloudcon East Presentation
Cloudcon East Presentation
Ā 
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
Ā 
Couchbase live 2016
Couchbase live 2016Couchbase live 2016
Couchbase live 2016
Ā 
Weakly Supervised Whole Slide Image Analysis Using Cloud Computing
Weakly Supervised Whole Slide Image Analysis Using Cloud ComputingWeakly Supervised Whole Slide Image Analysis Using Cloud Computing
Weakly Supervised Whole Slide Image Analysis Using Cloud Computing
Ā 
Smashing Big Data with AHA Hardware GZIP
Smashing Big Data with AHA Hardware GZIPSmashing Big Data with AHA Hardware GZIP
Smashing Big Data with AHA Hardware GZIP
Ā 
Low latency in java 8 by Peter Lawrey
Low latency in java 8 by Peter Lawrey Low latency in java 8 by Peter Lawrey
Low latency in java 8 by Peter Lawrey
Ā 
Deep Dive on Amazon EC2
Deep Dive on Amazon EC2Deep Dive on Amazon EC2
Deep Dive on Amazon EC2
Ā 
Deep Dive on Amazon EC2
Deep Dive on Amazon EC2Deep Dive on Amazon EC2
Deep Dive on Amazon EC2
Ā 

More from John Allspaw

Resilience Engineering: A field of study, a community, and some perspective s...
Resilience Engineering: A field of study, a community, and some perspective s...Resilience Engineering: A field of study, a community, and some perspective s...
Resilience Engineering: A field of study, a community, and some perspective s...John Allspaw
Ā 
Considerations for Alert Design
Considerations for Alert DesignConsiderations for Alert Design
Considerations for Alert DesignJohn Allspaw
Ā 
Velocity EU 2012 Escalating Scenarios: Outage Handling Pitfalls
Velocity EU 2012 Escalating Scenarios: Outage Handling PitfallsVelocity EU 2012 Escalating Scenarios: Outage Handling Pitfalls
Velocity EU 2012 Escalating Scenarios: Outage Handling PitfallsJohn Allspaw
Ā 
Responding to Outages Maturely
Responding to Outages MaturelyResponding to Outages Maturely
Responding to Outages MaturelyJohn Allspaw
Ā 
Resilient Response In Complex Systems
Resilient Response In Complex SystemsResilient Response In Complex Systems
Resilient Response In Complex SystemsJohn Allspaw
Ā 
Outages, PostMortems, and Human Error
Outages, PostMortems, and Human ErrorOutages, PostMortems, and Human Error
Outages, PostMortems, and Human ErrorJohn Allspaw
Ā 
Anticipation: What Could Possibly Go Wrong?
Anticipation: What Could Possibly Go Wrong?Anticipation: What Could Possibly Go Wrong?
Anticipation: What Could Possibly Go Wrong?John Allspaw
Ā 
Advanced PostMortem Fu and Human Error 101 (Velocity 2011)
Advanced PostMortem Fu and Human Error 101 (Velocity 2011)Advanced PostMortem Fu and Human Error 101 (Velocity 2011)
Advanced PostMortem Fu and Human Error 101 (Velocity 2011)John Allspaw
Ā 
Dev and Ops Collaboration and Awareness at Etsy and Flickr
Dev and Ops Collaboration and Awareness at Etsy and FlickrDev and Ops Collaboration and Awareness at Etsy and Flickr
Dev and Ops Collaboration and Awareness at Etsy and FlickrJohn Allspaw
Ā 
Ops Meta-Metrics: The Currency You Pay For Change
Ops Meta-Metrics: The Currency You Pay For ChangeOps Meta-Metrics: The Currency You Pay For Change
Ops Meta-Metrics: The Currency You Pay For ChangeJohn Allspaw
Ā 
Ops Meta-Metrics: The Currency You Pay For Change
Ops Meta-Metrics: The Currency You Pay For ChangeOps Meta-Metrics: The Currency You Pay For Change
Ops Meta-Metrics: The Currency You Pay For ChangeJohn Allspaw
Ā 
Capacity Planning For LAMP
Capacity Planning For LAMPCapacity Planning For LAMP
Capacity Planning For LAMPJohn Allspaw
Ā 
10+ Deploys Per Day: Dev and Ops Cooperation at Flickr
10+ Deploys Per Day: Dev and Ops Cooperation at Flickr10+ Deploys Per Day: Dev and Ops Cooperation at Flickr
10+ Deploys Per Day: Dev and Ops Cooperation at FlickrJohn Allspaw
Ā 
Capacity Planning for Web Operations - Web20 Expo 2008
Capacity Planning for Web Operations - Web20 Expo 2008Capacity Planning for Web Operations - Web20 Expo 2008
Capacity Planning for Web Operations - Web20 Expo 2008John Allspaw
Ā 

More from John Allspaw (14)

Resilience Engineering: A field of study, a community, and some perspective s...
Resilience Engineering: A field of study, a community, and some perspective s...Resilience Engineering: A field of study, a community, and some perspective s...
Resilience Engineering: A field of study, a community, and some perspective s...
Ā 
Considerations for Alert Design
Considerations for Alert DesignConsiderations for Alert Design
Considerations for Alert Design
Ā 
Velocity EU 2012 Escalating Scenarios: Outage Handling Pitfalls
Velocity EU 2012 Escalating Scenarios: Outage Handling PitfallsVelocity EU 2012 Escalating Scenarios: Outage Handling Pitfalls
Velocity EU 2012 Escalating Scenarios: Outage Handling Pitfalls
Ā 
Responding to Outages Maturely
Responding to Outages MaturelyResponding to Outages Maturely
Responding to Outages Maturely
Ā 
Resilient Response In Complex Systems
Resilient Response In Complex SystemsResilient Response In Complex Systems
Resilient Response In Complex Systems
Ā 
Outages, PostMortems, and Human Error
Outages, PostMortems, and Human ErrorOutages, PostMortems, and Human Error
Outages, PostMortems, and Human Error
Ā 
Anticipation: What Could Possibly Go Wrong?
Anticipation: What Could Possibly Go Wrong?Anticipation: What Could Possibly Go Wrong?
Anticipation: What Could Possibly Go Wrong?
Ā 
Advanced PostMortem Fu and Human Error 101 (Velocity 2011)
Advanced PostMortem Fu and Human Error 101 (Velocity 2011)Advanced PostMortem Fu and Human Error 101 (Velocity 2011)
Advanced PostMortem Fu and Human Error 101 (Velocity 2011)
Ā 
Dev and Ops Collaboration and Awareness at Etsy and Flickr
Dev and Ops Collaboration and Awareness at Etsy and FlickrDev and Ops Collaboration and Awareness at Etsy and Flickr
Dev and Ops Collaboration and Awareness at Etsy and Flickr
Ā 
Ops Meta-Metrics: The Currency You Pay For Change
Ops Meta-Metrics: The Currency You Pay For ChangeOps Meta-Metrics: The Currency You Pay For Change
Ops Meta-Metrics: The Currency You Pay For Change
Ā 
Ops Meta-Metrics: The Currency You Pay For Change
Ops Meta-Metrics: The Currency You Pay For ChangeOps Meta-Metrics: The Currency You Pay For Change
Ops Meta-Metrics: The Currency You Pay For Change
Ā 
Capacity Planning For LAMP
Capacity Planning For LAMPCapacity Planning For LAMP
Capacity Planning For LAMP
Ā 
10+ Deploys Per Day: Dev and Ops Cooperation at Flickr
10+ Deploys Per Day: Dev and Ops Cooperation at Flickr10+ Deploys Per Day: Dev and Ops Cooperation at Flickr
10+ Deploys Per Day: Dev and Ops Cooperation at Flickr
Ā 
Capacity Planning for Web Operations - Web20 Expo 2008
Capacity Planning for Web Operations - Web20 Expo 2008Capacity Planning for Web Operations - Web20 Expo 2008
Capacity Planning for Web Operations - Web20 Expo 2008
Ā 

Recently uploaded

FULL ENJOY - 9953040155 Call Girls in Old Rajendra Nagar | Delhi
FULL ENJOY - 9953040155 Call Girls in Old Rajendra Nagar | DelhiFULL ENJOY - 9953040155 Call Girls in Old Rajendra Nagar | Delhi
FULL ENJOY - 9953040155 Call Girls in Old Rajendra Nagar | DelhiMalviyaNagarCallGirl
Ā 
Jeremy Casson - An Architectural and Historical Journey Around Europe
Jeremy Casson - An Architectural and Historical Journey Around EuropeJeremy Casson - An Architectural and Historical Journey Around Europe
Jeremy Casson - An Architectural and Historical Journey Around EuropeJeremy Casson
Ā 
RAK Call Girls Service # 971559085003 # Call Girl Service In RAK
RAK Call Girls Service # 971559085003 # Call Girl Service In RAKRAK Call Girls Service # 971559085003 # Call Girl Service In RAK
RAK Call Girls Service # 971559085003 # Call Girl Service In RAKedwardsara83
Ā 
Hazratganj / Call Girl in Lucknow - Phone šŸ«— 8923113531 ā˜› Escorts Service at 6...
Hazratganj / Call Girl in Lucknow - Phone šŸ«— 8923113531 ā˜› Escorts Service at 6...Hazratganj / Call Girl in Lucknow - Phone šŸ«— 8923113531 ā˜› Escorts Service at 6...
Hazratganj / Call Girl in Lucknow - Phone šŸ«— 8923113531 ā˜› Escorts Service at 6...akbard9823
Ā 
Lucknow šŸ’‹ Escorts Service Lucknow Phone No 8923113531 Elite Escort Service Av...
Lucknow šŸ’‹ Escorts Service Lucknow Phone No 8923113531 Elite Escort Service Av...Lucknow šŸ’‹ Escorts Service Lucknow Phone No 8923113531 Elite Escort Service Av...
Lucknow šŸ’‹ Escorts Service Lucknow Phone No 8923113531 Elite Escort Service Av...anilsa9823
Ā 
FULL ENJOY - 9953040155 Call Girls in Gtb Nagar | Delhi
FULL ENJOY - 9953040155 Call Girls in Gtb Nagar | DelhiFULL ENJOY - 9953040155 Call Girls in Gtb Nagar | Delhi
FULL ENJOY - 9953040155 Call Girls in Gtb Nagar | DelhiMalviyaNagarCallGirl
Ā 
Islamabad Escorts # 03080115551 # Escorts in Islamabad || Call Girls in Islam...
Islamabad Escorts # 03080115551 # Escorts in Islamabad || Call Girls in Islam...Islamabad Escorts # 03080115551 # Escorts in Islamabad || Call Girls in Islam...
Islamabad Escorts # 03080115551 # Escorts in Islamabad || Call Girls in Islam...wdefrd
Ā 
Call Girl in Bur Dubai O5286O4116 Indian Call Girls in Bur Dubai By VIP Bur D...
Call Girl in Bur Dubai O5286O4116 Indian Call Girls in Bur Dubai By VIP Bur D...Call Girl in Bur Dubai O5286O4116 Indian Call Girls in Bur Dubai By VIP Bur D...
Call Girl in Bur Dubai O5286O4116 Indian Call Girls in Bur Dubai By VIP Bur D...dajasot375
Ā 
Akola Call Girls #9907093804 Contact Number Escorts Service Akola
Akola Call Girls #9907093804 Contact Number Escorts Service AkolaAkola Call Girls #9907093804 Contact Number Escorts Service Akola
Akola Call Girls #9907093804 Contact Number Escorts Service Akolasrsj9000
Ā 
Bridge Fight Board by Daniel Johnson dtjohnsonart.com
Bridge Fight Board by Daniel Johnson dtjohnsonart.comBridge Fight Board by Daniel Johnson dtjohnsonart.com
Bridge Fight Board by Daniel Johnson dtjohnsonart.comthephillipta
Ā 
Hazratganj ] (Call Girls) in Lucknow - 450+ Call Girl Cash Payment šŸ§„ 89231135...
Hazratganj ] (Call Girls) in Lucknow - 450+ Call Girl Cash Payment šŸ§„ 89231135...Hazratganj ] (Call Girls) in Lucknow - 450+ Call Girl Cash Payment šŸ§„ 89231135...
Hazratganj ] (Call Girls) in Lucknow - 450+ Call Girl Cash Payment šŸ§„ 89231135...akbard9823
Ā 
SHIVNA SAHITYIKI APRIL JUNE 2024 Magazine
SHIVNA SAHITYIKI APRIL JUNE 2024 MagazineSHIVNA SAHITYIKI APRIL JUNE 2024 Magazine
SHIVNA SAHITYIKI APRIL JUNE 2024 MagazineShivna Prakashan
Ā 
exhuma plot and synopsis from the exhuma movie.pptx
exhuma plot and synopsis from the exhuma movie.pptxexhuma plot and synopsis from the exhuma movie.pptx
exhuma plot and synopsis from the exhuma movie.pptxKurikulumPenilaian
Ā 
FULL ENJOY - 9953040155 Call Girls in Noida | Delhi
FULL ENJOY - 9953040155 Call Girls in Noida | DelhiFULL ENJOY - 9953040155 Call Girls in Noida | Delhi
FULL ENJOY - 9953040155 Call Girls in Noida | DelhiMalviyaNagarCallGirl
Ā 
Call Girl Service In Dubai #$# O56521286O #$# Dubai Call Girls
Call Girl Service In Dubai #$# O56521286O #$# Dubai Call GirlsCall Girl Service In Dubai #$# O56521286O #$# Dubai Call Girls
Call Girl Service In Dubai #$# O56521286O #$# Dubai Call Girlsparisharma5056
Ā 
FULL ENJOY - 9953040155 Call Girls in Uttam Nagar | Delhi
FULL ENJOY - 9953040155 Call Girls in Uttam Nagar | DelhiFULL ENJOY - 9953040155 Call Girls in Uttam Nagar | Delhi
FULL ENJOY - 9953040155 Call Girls in Uttam Nagar | DelhiMalviyaNagarCallGirl
Ā 
Aminabad @ Book Call Girls in Lucknow - 450+ Call Girl Cash Payment šŸµ 8923113...
Aminabad @ Book Call Girls in Lucknow - 450+ Call Girl Cash Payment šŸµ 8923113...Aminabad @ Book Call Girls in Lucknow - 450+ Call Girl Cash Payment šŸµ 8923113...
Aminabad @ Book Call Girls in Lucknow - 450+ Call Girl Cash Payment šŸµ 8923113...akbard9823
Ā 
FULL ENJOY - 9953040155 Call Girls in Moti Nagar | Delhi
FULL ENJOY - 9953040155 Call Girls in Moti Nagar | DelhiFULL ENJOY - 9953040155 Call Girls in Moti Nagar | Delhi
FULL ENJOY - 9953040155 Call Girls in Moti Nagar | DelhiMalviyaNagarCallGirl
Ā 
Turn Lock Take Key Storyboard Daniel Johnson
Turn Lock Take Key Storyboard Daniel JohnsonTurn Lock Take Key Storyboard Daniel Johnson
Turn Lock Take Key Storyboard Daniel Johnsonthephillipta
Ā 

Recently uploaded (20)

FULL ENJOY - 9953040155 Call Girls in Old Rajendra Nagar | Delhi
FULL ENJOY - 9953040155 Call Girls in Old Rajendra Nagar | DelhiFULL ENJOY - 9953040155 Call Girls in Old Rajendra Nagar | Delhi
FULL ENJOY - 9953040155 Call Girls in Old Rajendra Nagar | Delhi
Ā 
Jeremy Casson - An Architectural and Historical Journey Around Europe
Jeremy Casson - An Architectural and Historical Journey Around EuropeJeremy Casson - An Architectural and Historical Journey Around Europe
Jeremy Casson - An Architectural and Historical Journey Around Europe
Ā 
RAK Call Girls Service # 971559085003 # Call Girl Service In RAK
RAK Call Girls Service # 971559085003 # Call Girl Service In RAKRAK Call Girls Service # 971559085003 # Call Girl Service In RAK
RAK Call Girls Service # 971559085003 # Call Girl Service In RAK
Ā 
Hazratganj / Call Girl in Lucknow - Phone šŸ«— 8923113531 ā˜› Escorts Service at 6...
Hazratganj / Call Girl in Lucknow - Phone šŸ«— 8923113531 ā˜› Escorts Service at 6...Hazratganj / Call Girl in Lucknow - Phone šŸ«— 8923113531 ā˜› Escorts Service at 6...
Hazratganj / Call Girl in Lucknow - Phone šŸ«— 8923113531 ā˜› Escorts Service at 6...
Ā 
Lucknow šŸ’‹ Escorts Service Lucknow Phone No 8923113531 Elite Escort Service Av...
Lucknow šŸ’‹ Escorts Service Lucknow Phone No 8923113531 Elite Escort Service Av...Lucknow šŸ’‹ Escorts Service Lucknow Phone No 8923113531 Elite Escort Service Av...
Lucknow šŸ’‹ Escorts Service Lucknow Phone No 8923113531 Elite Escort Service Av...
Ā 
FULL ENJOY - 9953040155 Call Girls in Gtb Nagar | Delhi
FULL ENJOY - 9953040155 Call Girls in Gtb Nagar | DelhiFULL ENJOY - 9953040155 Call Girls in Gtb Nagar | Delhi
FULL ENJOY - 9953040155 Call Girls in Gtb Nagar | Delhi
Ā 
Islamabad Escorts # 03080115551 # Escorts in Islamabad || Call Girls in Islam...
Islamabad Escorts # 03080115551 # Escorts in Islamabad || Call Girls in Islam...Islamabad Escorts # 03080115551 # Escorts in Islamabad || Call Girls in Islam...
Islamabad Escorts # 03080115551 # Escorts in Islamabad || Call Girls in Islam...
Ā 
Call Girl in Bur Dubai O5286O4116 Indian Call Girls in Bur Dubai By VIP Bur D...
Call Girl in Bur Dubai O5286O4116 Indian Call Girls in Bur Dubai By VIP Bur D...Call Girl in Bur Dubai O5286O4116 Indian Call Girls in Bur Dubai By VIP Bur D...
Call Girl in Bur Dubai O5286O4116 Indian Call Girls in Bur Dubai By VIP Bur D...
Ā 
Akola Call Girls #9907093804 Contact Number Escorts Service Akola
Akola Call Girls #9907093804 Contact Number Escorts Service AkolaAkola Call Girls #9907093804 Contact Number Escorts Service Akola
Akola Call Girls #9907093804 Contact Number Escorts Service Akola
Ā 
Bridge Fight Board by Daniel Johnson dtjohnsonart.com
Bridge Fight Board by Daniel Johnson dtjohnsonart.comBridge Fight Board by Daniel Johnson dtjohnsonart.com
Bridge Fight Board by Daniel Johnson dtjohnsonart.com
Ā 
Hazratganj ] (Call Girls) in Lucknow - 450+ Call Girl Cash Payment šŸ§„ 89231135...
Hazratganj ] (Call Girls) in Lucknow - 450+ Call Girl Cash Payment šŸ§„ 89231135...Hazratganj ] (Call Girls) in Lucknow - 450+ Call Girl Cash Payment šŸ§„ 89231135...
Hazratganj ] (Call Girls) in Lucknow - 450+ Call Girl Cash Payment šŸ§„ 89231135...
Ā 
SHIVNA SAHITYIKI APRIL JUNE 2024 Magazine
SHIVNA SAHITYIKI APRIL JUNE 2024 MagazineSHIVNA SAHITYIKI APRIL JUNE 2024 Magazine
SHIVNA SAHITYIKI APRIL JUNE 2024 Magazine
Ā 
Dxb Call Girls # +971529501107 # Call Girls In Dxb Dubai || (UAE)
Dxb Call Girls # +971529501107 # Call Girls In Dxb Dubai || (UAE)Dxb Call Girls # +971529501107 # Call Girls In Dxb Dubai || (UAE)
Dxb Call Girls # +971529501107 # Call Girls In Dxb Dubai || (UAE)
Ā 
exhuma plot and synopsis from the exhuma movie.pptx
exhuma plot and synopsis from the exhuma movie.pptxexhuma plot and synopsis from the exhuma movie.pptx
exhuma plot and synopsis from the exhuma movie.pptx
Ā 
FULL ENJOY - 9953040155 Call Girls in Noida | Delhi
FULL ENJOY - 9953040155 Call Girls in Noida | DelhiFULL ENJOY - 9953040155 Call Girls in Noida | Delhi
FULL ENJOY - 9953040155 Call Girls in Noida | Delhi
Ā 
Call Girl Service In Dubai #$# O56521286O #$# Dubai Call Girls
Call Girl Service In Dubai #$# O56521286O #$# Dubai Call GirlsCall Girl Service In Dubai #$# O56521286O #$# Dubai Call Girls
Call Girl Service In Dubai #$# O56521286O #$# Dubai Call Girls
Ā 
FULL ENJOY - 9953040155 Call Girls in Uttam Nagar | Delhi
FULL ENJOY - 9953040155 Call Girls in Uttam Nagar | DelhiFULL ENJOY - 9953040155 Call Girls in Uttam Nagar | Delhi
FULL ENJOY - 9953040155 Call Girls in Uttam Nagar | Delhi
Ā 
Aminabad @ Book Call Girls in Lucknow - 450+ Call Girl Cash Payment šŸµ 8923113...
Aminabad @ Book Call Girls in Lucknow - 450+ Call Girl Cash Payment šŸµ 8923113...Aminabad @ Book Call Girls in Lucknow - 450+ Call Girl Cash Payment šŸµ 8923113...
Aminabad @ Book Call Girls in Lucknow - 450+ Call Girl Cash Payment šŸµ 8923113...
Ā 
FULL ENJOY - 9953040155 Call Girls in Moti Nagar | Delhi
FULL ENJOY - 9953040155 Call Girls in Moti Nagar | DelhiFULL ENJOY - 9953040155 Call Girls in Moti Nagar | Delhi
FULL ENJOY - 9953040155 Call Girls in Moti Nagar | Delhi
Ā 
Turn Lock Take Key Storyboard Daniel Johnson
Turn Lock Take Key Storyboard Daniel JohnsonTurn Lock Take Key Storyboard Daniel Johnson
Turn Lock Take Key Storyboard Daniel Johnson
Ā 

Operational Efficiency Hacks Web20 Expo2009

  • 1. Operational Efficiency Hacks. John Allspaw, Operations Engineering, Flickr
  • 2. who am I? Manage the Flickr Operations group Wrote a geeky book:
  • 5. “Efficiencies”: doing more with the robots you’ve got; doing more with the humans you’ve got
  • 8. Some optimization “rules” - Don’t rely on being able to tweak anything. - Don’t waste too much time tuning when you have no evidence it’ll matter.
  • 10. Optimization “rules”: “We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.” Knuth (or Hoare)
  • 15. Optimization “rules”: That doesn’t give us an excuse to be lazy and inefficient. We lean on the experience of people in the community for evidence that tuning(s) might be a worthwhile thing to do.
  • 16. Optimization “rules”: “Yet we should not pass up our opportunities in that critical 3 percent.” Knuth (or Hoare)
  • 17. So... (chart labels, left to right) “obvious tuning wins”; “stop somewhere in here”; “OMG I'm wasting !@#$ time for no reason”
  • 24. Our Context - 24 TB of MySQL data - 32k/sec of MySQL writes - 120k/sec of MySQL reads - 6 PB of photos - 10 TB storage eaten per day - 15,362 service monitors (alerts)
  • 27. Infrastructure Hacks - Examples of what changing software can do (plain old-fashioned performance tuning) - Examples of what changing hardware can do (yay for Mr. Moore!)
  • 28. Leaning on compilers (synthetic PHP benchmarks, not real-world) (http://sebastian-bergmann.de/archives/634-PHP-GCC-ICC-Benchmark.html)
  • 29. PHP (real-world) php 4.4.8 to php 5.2.8 migration
  • 30. Can now handle more with less: same taste, less filling
  • 34. Image Processing - 2004, Flickr was using ImageMagick for image processing (version 6.1.9) - Changed to GraphicsMagick, about 15% faster at the time (version 1.1.5) - Only need a subset of ImageMagick features anyway for our purposes
  • 35. Image Processing - OpenMP support (http://en.wikipedia.org/wiki/Openmp) - Allows parallelization of processing jobs, using multiple cores working on the same image - Some algorithms have more parallelization than others
  • 36. Image Processing - Test script - 7 large-ish DSLR photos - Cascade resizing each to 6 smaller sizes, semi-typical for Flickr’s workload - Each resize processed serially
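
A test script along these lines could be sketched as follows. This is only an illustration, not Flickr's actual script: it assumes GraphicsMagick's `gm convert ... -resize` command line, reads "cascade" as each smaller size being produced from the previous output, and uses hypothetical file names and target sizes that do not come from the slides.

```python
import subprocess
import time

# Hypothetical target sizes, largest first; the slides do not list the real ones.
SIZES = [1024, 640, 500, 240, 100, 75]


def cascade_commands(source, sizes=SIZES):
    """Build one `gm convert` command per size, each reading the previous output."""
    commands = []
    current = source
    stem = source.rsplit(".", 1)[0]
    for size in sizes:
        target = f"{stem}_{size}.jpg"
        commands.append(
            ["gm", "convert", current, "-resize", f"{size}x{size}", target]
        )
        current = target  # cascade: the next, smaller size starts from this one
    return commands


def run_benchmark(photos, runner=subprocess.check_call):
    """Run every resize serially and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    for photo in photos:
        for command in cascade_commands(photo):
            runner(command)
    return time.perf_counter() - start
```

Passing a different `runner` (for example `print`) shows the planned commands without invoking GraphicsMagick, which makes it easy to compare builds or compilers by swapping only the `gm` binary on the path.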
  • 37. Image Processing compiler differences (GM version 1.1.14, non-OpenMP)
  • 38. Image Processing OpenMP differences OpenMP advantage (gcc 4.1.2, on quad core Xeon L5335 @ 2.00GHz)
  • 39. Image Processing CPU differences
  • 43. Diagonal Scaling - Vertically scaling your already horizontally-scaled nodes - a.k.a. “tech refresh” - a.k.a. “Moore’s Law Surfing”
  • 47. Diagonal Scaling: We replaced 67 “old” webservers with 18 “new”: ~70% LESS power, 49U LESS rack space

        servers   CPUs per server   RAM per server   drives per server   total power (W) @60% peak
        67        2                 4GB              1x80GB              8763.6
        18        8                 4GB              1x146GB             2332.8
  • 54. Diagonal Scaling: We replaced 23 “old” image processing boxes with 8 “new”: ~75% FASTER, 15U LESS rack space, 65% LESS power (photos: “from this” ... “to this”)

        servers   photos/min   rack (U)   total power (W) @60% peak
        23        1035         23         3008.4
        8         1120         8          1036.8
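
The power and rack-space savings quoted on the diagonal scaling slides can be re-derived from the table figures. The sketch below uses only numbers from the slides; the one assumption, noted in the code, is that each server occupies 1U, which is consistent with the 49U and 15U figures.

```python
def percent_less(old, new):
    """Percentage reduction going from old to new."""
    return (old - new) / old * 100


# Figures taken from the diagonal scaling slides.
web = {"old_servers": 67, "new_servers": 18,
       "old_watts": 8763.6, "new_watts": 2332.8}
img = {"old_servers": 23, "new_servers": 8,
       "old_watts": 3008.4, "new_watts": 1036.8,
       "old_photos_min": 1035, "new_photos_min": 1120}

for name, fleet in (("webservers", web), ("image processing", img)):
    power = percent_less(fleet["old_watts"], fleet["new_watts"])
    # Assumption: 1U per server, so rack savings equal the server-count delta.
    rack = fleet["old_servers"] - fleet["new_servers"]
    print(f"{name}: {power:.1f}% less power, {rack}U less rack space")

# Throughput per box for the image processing fleet.
per_old = img["old_photos_min"] / img["old_servers"]
per_new = img["new_photos_min"] / img["new_servers"]
print(f"photos/min per server: {per_old:.0f} old vs {per_new:.0f} new")
```

The power reductions come out at roughly 73% for the webservers and about 65% for the image processing boxes, in line with the rounded figures on the slides.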
  • 58. What do you do with old/slow machines? - Liquidate - Re-purpose as dev/staging/etc - “offline” tasks
  • 64. Offline Tasks - Out-of-band/asynchronous queuing and execution system, for non-realtime tasks - See here: http://code.flickr.com/blog/2008/09/26/flickr-engineers-do-it-offline/ - See Myles Grant talk about it more here: http://en.oreilly.com/velocity2009/public/schedule/detail/7552
  • 65. Runbook Hacks “WTF HAPPENED LAST NIGHT?!”
  • 72. Why? As infrastructure grows, try to keep the Humans:Machines ratio from getting out of hand. Some of the How: - teach machines to build themselves - teach machines to watch themselves - teach machines to fix themselves - reduce MTTR by streamlining
  • 75. Automated Infrastructure - If there is only one thing you do, automatic configuration and deployment management should be it. - See: - Opscode/Chef (http://opscode.com/) - Puppet (http://reductivelabs.com/products/puppet/) - System Imager/Configurator (http://wiki.systemimager.org)
  • 77. Time: Machine time is cheaper than human time. If a failure results in some commands being run to ‘fix’ it, make the machines do it. (i.e., don’t wake people up for stupid things!)
  • 80. Aggregate Monitoring: Don’t care about single nodes, only care about delta change of metrics/faults - Warn (email) on X % change - Page (wake up) on Y % change. High and low water marks for some metrics
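
One way to read the warn/page rule is as a small classifier over a cluster-wide metric. The sketch below is an assumption-laden illustration: the function names, the threshold parameters, and the choice to page (rather than warn) when a water mark is crossed are not from the slides.

```python
def percent_change(previous, current):
    """Absolute percentage change of an aggregate metric between two samples."""
    if previous == 0:
        return 0.0 if current == 0 else float("inf")
    return abs(current - previous) / previous * 100


def classify(previous, current, warn_pct, page_pct, low=None, high=None):
    """Return 'page', 'warn' or 'ok' for a cluster-wide metric."""
    # Assumption: crossing a water mark pages regardless of the delta.
    if low is not None and current < low:
        return "page"
    if high is not None and current > high:
        return "page"
    change = percent_change(previous, current)
    if change >= page_pct:
        return "page"  # Y % change: wake someone up
    if change >= warn_pct:
        return "warn"  # X % change: send an email
    return "ok"
```

For example, `classify(100, 115, warn_pct=10, page_pct=30)` yields `"warn"`, while a drop from 100 to 60 with the same thresholds yields `"page"`.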
  • 87. Self-Healing: Make service monitoring fix common failure scenarios, notify us later about it. Daemons/processes run on machines, will take corrective action under certain conditions, and report back with what they did. Can greatly reduce your mean time to recovery (MTTR)
  • 89. Basic Apache Example 1. Webserver not running?
  • 90. Basic Apache Example 1. Webserver not running? 2. Under certain conditions, try to start it, and email that this happened. (Iā€™ll read it tomorrow)
  • 91. Basic Apache Example
      1. Webserver not running?
      2. Under certain conditions, try to start it, and email that this happened. (I’ll read it tomorrow)
      3. Won’t start? Assume something’s really wrong, so don’t keep trying (email that, too)
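The three steps above might look something like the following sketch. It is not the actual Flickr check: the health probe, the start command, the "certain conditions" test, and the notifier are passed in as hypothetical callables, so the control flow is visible without assuming any particular init system or mailer.

```python
from typing import Callable


def heal_webserver(is_running: Callable[[], bool],
                   conditions_ok: Callable[[], bool],
                   start: Callable[[], None],
                   notify: Callable[[str], None],
                   max_attempts: int = 1) -> str:
    """Try a bounded number of restarts, and always report what happened."""
    # Step 1: nothing to do if the webserver is already up.
    if is_running():
        return "running"
    # Step 2: only act when the (site-specific) preconditions hold.
    if not conditions_ok():
        notify("webserver is down; conditions for auto-restart not met")
        return "skipped"
    for attempt in range(1, max_attempts + 1):
        start()
        if is_running():
            notify(f"webserver was down; restarted on attempt {attempt}")
            return "restarted"
    # Step 3: stop retrying and say so.
    notify(f"webserver would not start after {max_attempts} attempt(s); giving up")
    return "gave_up"
```

Run from cron or a monitoring plugin, `notify` would typically send the low-priority email the slide mentions; bounding `max_attempts` is what keeps a genuinely broken server from being restarted in a loop.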
  • 98. MySQL Self-Healing
      Some MySQL issues “fixed” by the machines
      - Kill long-running SELECT queries (marked safe to kill)
      - Queries not safe to kill are marked by the application as “NO KILL” in comments
      - Run EXPLAIN on killed queries, and report the results
      - Keep track of the query types and databases that need the most killing, produce a “DBs that Suck” report
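As a rough illustration of the selection rule on this slide, the sketch below picks kill candidates from a list of running queries. The `RunningQuery` type, the time limit, and the plain substring test for the “NO KILL” marker are assumptions for the example; in practice the rows would come from something like `SHOW PROCESSLIST`, and each chosen thread would then be passed to `KILL`.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RunningQuery:
    thread_id: int   # connection/thread identifier
    seconds: int     # how long the query has been running
    sql: str         # query text, including any application comments


def queries_to_kill(processlist: List[RunningQuery],
                    max_seconds: int) -> List[RunningQuery]:
    """Long-running SELECTs that the application has not marked NO KILL."""
    victims = []
    for query in processlist:
        # Simplification: only statements that start with SELECT qualify,
        # so a query with a leading comment is never selected.
        is_select = query.sql.lstrip()[:6].upper() == "SELECT"
        # Simplification: the marker is matched anywhere in the text.
        protected = "NO KILL" in query.sql
        if is_select and not protected and query.seconds > max_seconds:
            victims.append(query)
    return victims
```

Keeping the selection separate from the kill itself makes it easy to log each victim and run `EXPLAIN` on its text for the follow-up report.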
  • 103. MySQL Self-Healing
      Some MySQL replication issues “fixed” by the machines, by error
      - Skip errors that can safely be skipped and restart slave threads
      - Force refetch of replication binlogs on:
        - 1064 (ER_PARSE_ERROR)
      - Re-run queries on:
        - 1205 (ER_LOCK_WAIT_TIMEOUT)
        - 1213 (ER_LOCK_DEADLOCK)
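The per-error handling above amounts to a small dispatch table, sketched below. The action names and the fall-through to a human are assumptions for illustration. The slide does not list the skippable error numbers, so the three used here (matching the “already exists”, “duplicate key name”, and “can’t drop” cases from the editor's notes) are an illustrative guess rather than the actual Flickr list.

```python
# Assumed skippable codes; the slides name the error types, not the numbers.
SKIPPABLE = frozenset({
    1050,  # ER_TABLE_EXISTS_ERROR
    1061,  # ER_DUP_KEYNAME
    1091,  # ER_CANT_DROP_FIELD_OR_KEY
})
REFETCH = frozenset({1064})      # ER_PARSE_ERROR
RERUN = frozenset({1205, 1213})  # ER_LOCK_WAIT_TIMEOUT, ER_LOCK_DEADLOCK


def replication_action(errno: int) -> str:
    """Map a replication error number to a corrective action name."""
    if errno in SKIPPABLE:
        return "skip_and_restart_slave"
    if errno in REFETCH:
        return "refetch_binlogs"
    if errno in RERUN:
        return "rerun_query"
    # Anything unrecognised is left for a human to look at.
    return "page_human"
```

An explicit allow-list means an unfamiliar error is never skipped automatically; it always ends up in front of a person.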
  • 107. Code and Config Deploy Logs
      1. ESSENTIAL
      2. MANDATORY
  • 111. Communications
      • Internal IRC
        - For ongoing discussions
        - Logged, so “infinite” scrollback
      • IM Bot (built on libyahoo2.sf.net)
        - For production changes
        - Broadcasts all to all contacts
        - Logged, and injected into IRC
        - IM Status = who is in primary/secondary on-call
      • All of IRC and IM Bot slurped into a search index
  • 119. when, what, detailed what*, who, time of last deploy at top of Ganglia
      *also points to what commands should be used to back out the changes
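The fields called out on this slide suggest a deploy log record along the following lines. This is only a sketch: the class name, field names, and the one-line summary format are hypothetical, and the slides do not show how the real log is stored.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class DeployLogEntry:
    when: datetime   # timestamp of the deploy
    who: str         # person who pushed the change
    what: str        # short summary of the change
    detail: str      # the "detailed what", e.g. a diff or changelog pointer
    backout: str     # commands to use to back out the change

    def one_line(self) -> str:
        """Compact form, suitable for a chat broadcast or a graph annotation."""
        return f"{self.when:%Y-%m-%d %H:%M:%S} {self.who}: {self.what}"


entry = DeployLogEntry(
    when=datetime(2009, 6, 23, 14, 5, 0, tzinfo=timezone.utc),
    who="alice",                      # hypothetical user
    what="config push",
    detail="see the changelog entry for this push",
    backout="re-deploy the previous revision",
)
print(entry.one_line())
```

Storing the back-out commands alongside the change means whoever is on call does not have to reconstruct them during an incident.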
  • 124. IM Bot (timestamps help correlation)
      all IRC, IM bot into searchable history
  • 129. Morals of Our Stories
      - Optimizations can be a Very Good Thing™
      - Weigh time spent optimizing against expected gains
      - Lean on others for how much “expected gains” mean for different scenarios
      - Plain old-fashioned intuition
  • 130. Some Wisdom Nuggets
      Jon Prall’s 85 WebOps Rules: http://jprall.vox.com/library/post/85-operations-rules-to-live-by.html

Editor's Notes

  1. Finding huge gains from tweaking gets harder as you turn the knobs and pull the levers.
  2. He had some evidence that suggested compilers and versions would make a difference for PHP.
  3. So, we did it. (not just for performance reasons, but still...) Will this happen to you, too? Maybe. Maybe not.
  4. Performance gains like this don’t come very often, for “free”.
  5. We don’t need 80% of what ImageMagick has.
  8. Cascading resizing: Original -> Large; Large -> Medium; Large -> Small; Medium -> Thumb; Medium -> Square
  9. This was done before OpenMP support was in. Compilers and optimization flags can make a difference!
  10. Enter OpenMP. Example of how
  11. These have been the revisions of our image processing hardware over time. 4x faster
  12. Examples are contact notifications, large photoset deletions, etc.
  18. Runbook hacks: tuning the process of operations failure handling and mitigation.
  19. Low water marks as indirect trouble indicators.
  21. Can be as simple as cron jobs, or Nagios plugins. NRPE or NSCA.
  27. Skippable errors: “can’t drop”, “already exists”, “duplicate key name”
  31. Simple tips and tricks that can help in fixing things when they break.
  32. This shouldn’t be considered optional.
  34. Our IRC and IM logs get injected into a search engine, almost exactly like Lucene.