Big Ruby Conf 2014 Notes
These are the notes I took while attending Big Ruby 2014 in Grapevine, Texas.
Confreaks now has the Big Ruby 2014 videos available for viewing. Harry Potter and the Legacy Codebase is a good watch. “Think of working with legacy code like an archaeologist.”
Testing the untestable
- Start with testing as though code is in a black box
- Heroku managed this by actually just git push/deploy/build to test build packs
- heroku run bash will put you in a terminal on your dyno
- If Heroku can deploy 6 times and have at least 1 passing test suite, they consider it successful
- Heroku test speeds went from 5 minutes per test suite to 44 tests in 12 minutes.
- He has worked at companies that had Rails tests going on for 30 minutes per suite
- parallel_rspec (from the parallel_tests gem) is worth looking into for speeding up tests
- Testing allowed Heroku to be aggressive in refactoring.
- They used to practice Minimum Viable Patching because they wanted to add/modify the least amount of code per patch
- build out smaller black boxes and unit test those smaller components
- Untestable scenarios
- Mock for determinism (webmock for small pages, VCR for larger network traffic)
- codetriage.com
- If things seem too big, start with integration tests
- Test things that would hurt if they break.
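The "mock for determinism" point can be sketched without any gems. This is a minimal illustration, not what Heroku does: TitleFetcher, CANNED_PAGES, and the example.test URL are made up, and the injected lambda stands in for what webmock or VCR would provide.

```ruby
# Hypothetical page-title fetcher; the HTTP client is injected so a test can
# substitute a canned response instead of hitting the network.
class TitleFetcher
  def initialize(http_get)
    @http_get = http_get # any callable taking a URL and returning a body String
  end

  def title(url)
    body = @http_get.call(url)
    body[%r{<title>(.*?)</title>}m, 1]
  end
end

# Deterministic stand-in for the network (assumed data, not a real site).
CANNED_PAGES = {
  "http://example.test/" => "<html><title>Hello</title></html>"
}.freeze

fake_get = ->(url) { CANNED_PAGES.fetch(url) }
fetcher = TitleFetcher.new(fake_get)
puts fetcher.title("http://example.test/")
```

Because the response is fixed, the test gives the same answer on every run, which is the whole point of mocking the network.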
Business intelligence
- Supporting decision making
- Data is generally an afterthought to devs, it’s just a storage for objects
- How do you use the data that’s stored to make business decisions
- The naive approach
- “We can run reports straight from the transactional db”
- The complex joins are slow
- affects production performance
- Figure out the business needs instead of just feature requests
- They might not know which data is valuable (e.g., we know how loyal customers are by the last login date)
- Turn inferences into facts
- De-normalize data for the facts. It’s ok for data to be stored in many places if the facts are only stored once.
- Tables can communicate more than graphs
- “A” recipe for success
- Use Ruby - TDD, deployment, “data munging”, many powerful ORMs
- MongoDB
- Flexible schemas
- Map/Reduce
- New Aggregation framework – Check this out
- Postgres for transactional
- Minimize on-demand calculations
- Store facts, not attributes (denormalized)
- SQL is the recipe and the report is the cake. Store the cake too.
- Not storing objects, storing events (events don’t change)
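A rough sketch of "store events, not objects" and "store facts" together, in plain Ruby rather than MongoDB or Postgres. FactStore, Event, and the last-login fact are assumptions made up for illustration, borrowing the loyalty example from the notes above.

```ruby
require "date"

# Append-only event log: each event is frozen because events never change.
Event = Struct.new(:customer_id, :type, :occurred_on)

class FactStore
  attr_reader :events, :last_login_on

  def initialize
    @events = []
    @last_login_on = {} # denormalized fact: customer_id => date of latest login
  end

  # Record the event and update the precomputed fact at write time,
  # so reports never have to scan the event log on demand.
  def record(event)
    @events << event.freeze
    return unless event.type == :login

    current = @last_login_on[event.customer_id]
    if current.nil? || event.occurred_on > current
      @last_login_on[event.customer_id] = event.occurred_on
    end
  end
end

store = FactStore.new
store.record(Event.new(1, :login, Date.new(2014, 2, 1)))
store.record(Event.new(1, :purchase, Date.new(2014, 2, 3)))
store.record(Event.new(1, :login, Date.new(2014, 2, 20)))
puts store.last_login_on[1]
```

The report reads the stored fact directly (the cake), while the raw events stay available if a new question comes up later.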
Job processing
- Queues and Processors
- Queues
- Durability, Availability, Throughput (pick two)
- Durability - Message sent, how likely is it going to be received
- Availability - How fault tolerant
- Throughput - How many jobs can we push through
- Cost?
- RabbitMQ - Fast, durable, reasonably fault tolerant (entirely how you configure it)
- Amazon SQS - highly Durable, highly Available, reasonable throughput, Not fast
- Redis - High throughput, fairly durable, availability is hard, really fast
- Not all jobs are the same, Tapjoy has roughly 2 different types of jobs
- X00,000 / minute (events) - RabbitMQ
- 1,000,000 / hour (general jobs) - SQS
- Processors
- Definition
- A way to retrieve jobs
- A way to match that job to code
- A way to run that code
- Concurrently with RabbitMQ and Fibers
- Ruby - Forks, Fibers, Threads
- Rabbit - Async, Stateful connections (retrieve and say finished on same state), no batch processing
- Fibers - Coroutines on the main stack, Fibers are great for async, no worries about cross thread talking
- If CPU bound, it’s not worth it. however, IO bound is ok.
- Pain points
- SystemStackError - a fiber's stack is a fixed size (the size varies across Ruby versions), so deep call stacks overflow it (e.g., can’t save an AR object in a Fiber)
- Needs Fiber-aware network libraries
- Code needs to know it’s on a fiber
- Fibers aren’t good for general-purpose jobs and RabbitMQ is complex. He says he wishes he had looked harder at Celluloid.
- SQS Job Processor
- SQS - HTTP API, Synchronous, allows for batch processing
- Why not threads
- Global Interpreter Lock
- Thread Safety
- Memory Bloat
- Building the processor
- Threads to fetch
- Fork for each batch of messages
- child has a pipe to parent to share stats
- However, pipe processing is slow (Tapjoy gets 6 children before hitting the CPU upper bound)
- Resolv is a DNS resolver written in Ruby rather than C, so lookups don’t hold the GIL; use it instead of the C resolver if you have more than 1 thread
- Look into learning to use GDB (the LLVM equivalent, lldb, is also useful)
- Sometimes the bug is not your bug
- Chore: Resque-compatible serialization, per-job config, jobs for each server, queue agnostic and concurrency agnostic
- Not released yet :(
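The three-part processor definition above (retrieve jobs, match a job to code, run that code) can be sketched in a few lines. This is only an illustration and not Tapjoy's Chore: the job classes, the hash-based job format, and the registry are all assumptions, and Ruby's in-memory Queue stands in for RabbitMQ or SQS.

```ruby
# Hypothetical job classes; a real system would do actual work in #perform.
class SendEmail
  def perform(args)
    "emailed #{args.fetch('to')}"
  end
end

class ResizeImage
  def perform(args)
    "resized #{args.fetch('path')}"
  end
end

class Processor
  def initialize(queue, registry)
    @queue = queue       # 1. a way to retrieve jobs
    @registry = registry # 2. a way to match a job to code
  end

  # Drain the queue and return each job's result.
  def run
    results = []
    until @queue.empty?
      job = @queue.pop
      handler = @registry.fetch(job.fetch("class"))     # match the job to code
      results << handler.new.perform(job.fetch("args")) # 3. run that code
    end
    results
  end
end

queue = Queue.new # thread-safe in-memory queue from Ruby core
queue << { "class" => "SendEmail", "args" => { "to" => "a@example.test" } }
queue << { "class" => "ResizeImage", "args" => { "path" => "cat.png" } }

registry = { "SendEmail" => SendEmail, "ResizeImage" => ResizeImage }
results = Processor.new(queue, registry).run
puts results
```

Keeping the queue and the registry as constructor arguments is what makes a processor "queue agnostic": swapping the queue does not touch the job code.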
Shopify sharded Rails
- Sharding - data over more than 1 db
- Rails assumes 1 db
- Auto-increment doesn’t work
- Normally can get by on tune and cache for years
- Why Shard
- Smaller indexes
- Better localization, data stays warmer longer
- No joins between shards. Normally there is a ‘stop and shard’ moment; Shopify doesn’t do this.
- Denormalized everything to have a shop_id since everything is scoped through shop
- Noeqd, like snowflake for ids
- Make sure ids are javascript safe
- Shopify ids are auto_incr + N * auto_incr
- Rebalancing (moving a shop across shards): lock the shop, then move it
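On "make sure ids are javascript safe": JavaScript numbers are doubles, so integers above 2^53 - 1 (Number.MAX_SAFE_INTEGER) lose precision when an id is sent as a JSON number. A minimal check, assuming ids are non-negative integers; js_safe_id? is a made-up helper name, and the notes don't say how Shopify enforces this.

```ruby
# Largest integer a JavaScript Number can represent exactly (2^53 - 1,
# i.e. Number.MAX_SAFE_INTEGER).
JS_MAX_SAFE_INTEGER = 2**53 - 1

# True when the id survives a round trip through a JavaScript Number.
def js_safe_id?(id)
  id.is_a?(Integer) && id.between?(0, JS_MAX_SAFE_INTEGER)
end

puts js_safe_id?(JS_MAX_SAFE_INTEGER) # largest safe value
puts js_safe_id?(2**63 - 1)           # a full 64-bit id is too large
```

Ids that fail the check are usually sent to browsers as strings instead.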
LivingSocial lightning talks
- Check out Pry
- text-table gem for viewing data in the console
- Using thor to grep to view logs
- “Telling a programmer there is already a library for that is like telling a song writer there is already a song about love”
Castle on a cloud
- Look up ChatOps video
- common tasks are automated through hubot commands
- IAM is basically like LDAP. Can limit within a bucket too.
- look into graphite for graphing. I think it’s a python app
- VPC as a VPN
- GitHub is a MySQL shop and uses RDS even for Heroku-based apps
Working effectively on a distributed team
- As Remote Worker
- Be visible
- make sure people see your work
- speak up in meetings
- Use video, if you have to talk to someone, make it a video
- Be visible with your calendar (we currently don’t really do this)
- Be yourself
- LivingSocial has an ‘off topic’ room in Campfire that lets people show their personalities by being silly and having fun
- You need the water cooler time even online. work can be / is social
- People need to see you in addition to your work
- Be disciplined
- Not just avoiding laziness; also be effective with the team (e.g., not working during odd hours)
- Set up a dedicated workspace
- Be available
- Be flexible
- Be free (e.g., if you want to move around, move around. just do your work)
- Use video - it’s well worth the additional effort, even if it’s 6fps
- Screenhero for pair programming (screen sharing and audio)
Refactoring with science
- GitHub used to say “never break the API”; now, because of usage data, they can make changes and contact the affected users
- GitHub deploys branches to production, rather than merging into master and then deploying
- This allows for easier rollbacks: just deploy master again to revert
- dat-science allows for basically A/B Testing of code
- dat-analysis allows for visualizing the dat-science results
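The idea behind this kind of experiment is to run the old and new code paths side by side, keep returning the old result, and record any differences. The sketch below is a hand-rolled stand-in to show the concept only. It is not the dat-science API, and the Experiment class and its method names are assumptions.

```ruby
# Hypothetical stand-in for the idea behind dat-science; not the gem's API.
class Experiment
  attr_reader :mismatches

  def initialize(name)
    @name = name
    @mismatches = []
  end

  def control(&block)
    @control = block
  end

  def candidate(&block)
    @candidate = block
  end

  # Run both code paths, record any difference, and always return the
  # control's result so callers never see the candidate's behavior.
  def run
    expected = @control.call
    actual = begin
      @candidate.call
    rescue StandardError => e
      e # a raising candidate is recorded as a mismatch, not propagated
    end
    unless expected == actual
      @mismatches << { name: @name, control: expected, candidate: actual }
    end
    expected
  end
end

experiment = Experiment.new("full-name")
experiment.control   { ["Ada", "Lovelace"].join(" ") }
experiment.candidate { "Ada" + "Lovelace" } # deliberately buggy rewrite
result = experiment.run
puts result
puts experiment.mismatches.size
```

In production the mismatches would be published somewhere for later analysis instead of being held in memory.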
Building a service
- Before writing a service, they write a spec
- Curl-ish, Description, URL params, Request Body, Response (201 created, 409 exist… etc)
- Then they write the client. It allows them to understand error handling
- Writing configuration
- Rack::Test is good for testing API services
- Dev would be best in Docker or Vagrant, but Union Metrics (Austin dev team) just ‘wings it’
- Ops is important in SLA
- Benefits
- small deploys, experiments (eg with different languages) because it only has to talk HTTP
- Experiments are double edged. happy programmers due to trying new things. but you might get boxed in to something
- client and servers are built in parallel for faster development. stubbing the server allows the client to be visualized quicker
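The spec-first approach above (request body in, 201 created or 409 exists out) can be tried against a tiny stub server before the real one exists. This is a sketch under assumptions: the /widgets endpoint and WidgetService are invented, and the service is a bare Rack-style callable driven by a hand-built env hash, so neither Rack nor Rack::Test is required here.

```ruby
require "json"
require "stringio"

# Hypothetical in-memory "widgets" service, written as a bare Rack-style
# callable: it takes an env hash and returns [status, headers, body].
class WidgetService
  def initialize
    @widgets = {}
  end

  def call(env)
    unless env["REQUEST_METHOD"] == "POST" && env["PATH_INFO"] == "/widgets"
      return respond(404, { error: "not found" })
    end

    name = JSON.parse(env["rack.input"].read).fetch("name")
    return respond(409, { error: "widget already exists" }) if @widgets.key?(name)

    @widgets[name] = true
    respond(201, { name: name })
  end

  private

  def respond(status, payload)
    [status, { "content-type" => "application/json" }, [JSON.generate(payload)]]
  end
end

# Build the minimal env a Rack server would pass for a JSON POST.
def post_env(path, payload)
  { "REQUEST_METHOD" => "POST", "PATH_INFO" => path,
    "rack.input" => StringIO.new(JSON.generate(payload)) }
end

service = WidgetService.new
first_status, = service.call(post_env("/widgets", { name: "gear" }))
second_status, = service.call(post_env("/widgets", { name: "gear" }))
puts first_status
puts second_status
```

A stub like this gives the client team something to code against, including the error paths, while the real server is still being built.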
Legacy codebase
- It’s not a problem until it’s a problem
- Think of working with legacy code like an archaeologist
- You have to dig, research, piece together… all to figure out what happened and why these choices were made
- Survey, Excavation, Analysis
- Survey
- Take inventory with tests, code comments, new relic, benchmarks
- Ask dumb questions and flag myths. you’ll be able to figure out the culture which leads to insights in the code
- Tricks -> Techniques -> Process -> Methodology -> Dogma (seems like you should watch out for this)
- Excavation
- If you have a goal to fix something, fix something. If something else comes out of the excavation, make an issue and get to it after your goal
- “First, do no harm”
- Use warn when making changes while leaving the current implementation in place
- Analysis
- Keep up the documentation in blog posts, wiki articles, readme or even tests
- Document everything, even the most trivial things like setting up from scratch. Also keep it up to date
- Make a map. Search for gems that generate UML diagrams for a Rails app
- The fear of looking at and working with a legacy codebase is all in your head
- Ask why is it the way it is
- It’s not bad code in the sense that it’s not working (it probably is working) but it just never went to refactor after green
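The "use warn" tip from the excavation step might look like the sketch below. Invoice and its methods are hypothetical, and the speaker's exact usage wasn't captured in these notes. Kernel#warn writes to standard error, so the old code path keeps working while its remaining callers become visible in the logs.

```ruby
# Hypothetical legacy class: the old method stays in place and keeps working,
# but warns on stderr so callers can be found and migrated.
class Invoice
  def initialize(cents)
    @cents = cents
  end

  def total
    @cents / 100.0
  end

  # Old name, kept until every caller has moved to #total.
  def amount
    warn "[DEPRECATED] Invoice#amount is deprecated; use Invoice#total " \
         "(called from #{caller(1, 1)&.first})"
    total
  end
end

puts Invoice.new(1250).amount
```

Once the warning stops showing up in the logs, the old method can be deleted with some confidence.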
Active interaction
- “Skinny controllers, fat models” leads to god models
- Rails says to focus models on NOUNS
- Consider making models with verbs
- active_interaction
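A model named after a verb, sketched in plain Ruby. The active_interaction gem has its own richer DSL, which is not shown here. SignUp, its Result struct, and the validation rules are all assumptions for illustration.

```ruby
# Hypothetical verb-named model; not the active_interaction API.
class SignUp
  Result = Struct.new(:valid, :value, :errors) do
    def valid?
      valid
    end
  end

  def self.run(email:, users:)
    new(email, users).run
  end

  def initialize(email, users)
    @email = email
    @users = users # stand-in for a users table
  end

  # Validate the inputs, then perform the action; never raise for bad input.
  def run
    errors = []
    errors << "email is invalid" unless @email.to_s.include?("@")
    errors << "email is taken" if @users.include?(@email)
    return Result.new(false, nil, errors) unless errors.empty?

    @users << @email
    Result.new(true, @email, [])
  end
end

users = ["ada@example.test"]
ok = SignUp.run(email: "grace@example.test", users: users)
dup = SignUp.run(email: "ada@example.test", users: users)
puts ok.valid?
puts dup.errors.inspect
```

The sign-up rules live in one small object, so neither the User model nor the controller has to grow to hold them.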
Key models
- RDB - Relations, transactions, schemas, ability to extend and ad-hoc queries
- KVS - Schema-less, single-access reads, write-heavy (append only), easier to scale (restricted api), it’s just a hash
Managing fleets (could be Postgres)
- Clint is from Missouri. Not St. Louis, not KC. Columbia.
- In the beginning there was a Sinatra app talking to AWS using Sequel
- The simplest thing that could possibly work, but no less
- Now 5 apps in Sinatra, Fog to talk to AWS, and still Sequel
- They use Sinatra to map to other Sinatra apps based on URL
- 50 worker types, 100s of workers all doing the heavy lifting
- Each machine only has the OS, Postgres, and WAL-E
- Monitoring is from the outside in. Using State Machines and stateless workers
- This is kind of like in a game. Loop over: observe environment, do something based on env
- Their queue loop is: while there is a resource, feel, do, requeue, over and over again
- Feeler - collects data on the resource. Appends data to an observation table. Finds out the state as well
- Think - Evaluate the state the resource is in (e.g., if it’s uncertain, do x)
- Stateless (background) workers talk to AWS, the Heroku API, and Postgres because they all require a network connection
- Failure
- notify and push to another queue
- it either resolves or needs a human
- the notifications lead to the development of playbooks based on the type of incident notification.
- They have codified their playbook for different incidents that can resolve them (like restarting)
- “Have you tried turning it off and on again?” - can resolve a lot of AWS issues because images will pop up on other machines
- If the codified resolver can’t do it, it says “I need an adult” and sends it to a human
- State Machines are great
- Use Stateless workers
- Expect things to break
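The observe-then-act loop described above (feel, think, do, requeue) can be sketched as follows. Everything here is hypothetical: the Resource class, its states, and the "restart" action are stand-ins, and the real system talks to AWS and Postgres rather than counting checks.

```ruby
# Hypothetical resource (e.g. a database server) tracked by a tiny state machine.
class Resource
  attr_reader :name, :state, :observations

  def initialize(name, healthy_after:)
    @name = name
    @state = :uncertain
    @observations = [] # append-only observation log
    @checks = 0
    @healthy_after = healthy_after # simulated: responds from this check onward
  end

  # Feel: collect data about the resource and append it to the log.
  def feel
    @checks += 1
    @observations << { check: @checks, responding: @checks >= @healthy_after }
  end

  # Think: evaluate the latest observation to decide the current state.
  def think
    @state = @observations.last[:responding] ? :available : :unavailable
  end
end

# Do: act on the state; returns true when the resource should be requeued.
def act(resource, log)
  return false if resource.state == :available

  log << "restart #{resource.name}" # stand-in for a codified playbook step
  true
end

queue = [Resource.new("db-1", healthy_after: 3)]
log = []
until queue.empty?
  resource = queue.shift
  resource.feel
  resource.think
  queue << resource if act(resource, log) # requeue until healthy
end

puts log.size
puts resource.state
```

The worker itself keeps no state between iterations; everything it needs is in the resource's state and observation log, which is what lets any worker pick up any resource.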
Understanding others
- Leveling up people skills
- Everyone is different
- Developers
- Grinders - product/feature-obsessed. Iterative. “Move fast, break things”
- Explain how process X is better to communicate
- Use TDD
- Tour Guides - Works across the system/stack. Helps others
- Ask how is something the way it is.
- 50/50 split between legacy and sharing knowledge. write it down and spread the knowledge
- Geniuses - Thinkers, worriers. Focused on quality and experimentation
- Ask how can we iterate towards it being better
- Make sure they don’t want to completely throw out the current system
- T-shaped people - Focused on one area of expertise.
- they might be so-so at everything else except that one thing.
- share that particular knowledge. teach their weaknesses
- Fun leader
- People skills with some tech knowledge
- They can’t fix every people problem. let them cool down.
- Learn about yourself and know your own tendencies
- Know what you need to recharge yourself. Know how to humor people
- Know when your personality type is in conflict with the other
- Empathy - Understand the other and their current situation and next move
- Can be difficult online.
- Why would someone say that?
- Hanlon’s razor: stupid over malice
- Occam’s razor: simple over complex
- There is another person on the other end of the text box
- “An adult is someone who has been around longer than one hype cycle”
- “Upstart is 1-3 years” and wants to rewrite the world