Author Archives: admin

Database backup load analysis

CodeGuard Database Backup Load Testing
Version: 1.1 / Date: 25 April, 2013

Overview

The goal of this test is to determine what impact the CodeGuard service has on a host server during a MySQL database backup. The results contained in this document were gathered after a series of four tests paralleling the tests previously done on FTP and SFTP backups by Jonathan Manuzak (see other document). They are by no means conclusive and real-world results will vary based on the composition of the websites being backed up and the hardware and software running the underlying host server.

Test Methodology

From the perspective of the CodeGuard backup service, there is a single phase that occurs on a remote server: running the ‘mysqldump’ command. Running this command uses the MySQL client, either connecting remotely or tunneled over SSH, to simultaneously extract the database tables to a flat file format and transfer that file from the remote server to the CodeGuard service. Currently ‘mysqldump’ commands are run with the options ‘--quick’, ‘--single-transaction’, and ‘--skip-extended-insert’. Respectively, these retrieve rows from the server one at a time rather than buffering whole tables in memory, attempt to ensure a consistent state for some types of tables, and avoid multiple-row INSERT statements. Executing mysqldump with different options may result in different performance characteristics.
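As a rough sketch of the invocation described above, the argument lists might be assembled as follows. Only the three dump options come from this document; the host names, user names, database name, port numbers, and the helper names build_dump_command and build_tunnel_command are hypothetical placeholders, not CodeGuard's actual code. The commands are built but not executed, and password handling is left out.

```python
# Hypothetical sketch: assemble (but do not run) the commands for a dump.
# Only the three dump options come from this document; hosts, users, ports,
# and function names are illustrative assumptions.

DUMP_OPTIONS = [
    "--quick",                 # fetch rows one at a time instead of buffering a table
    "--single-transaction",    # consistent snapshot for transactional tables (e.g. InnoDB)
    "--skip-extended-insert",  # one INSERT statement per row
]


def build_dump_command(host, port, user, database):
    """Return the mysqldump argv for a direct or tunneled connection."""
    return [
        "mysqldump",
        *DUMP_OPTIONS,
        f"--host={host}",
        f"--port={port}",
        f"--user={user}",
        database,
    ]


def build_tunnel_command(ssh_user, ssh_host, local_port, remote_port=3306):
    """Return an ssh argv that forwards local_port to MySQL on the remote host."""
    return [
        "ssh",
        "-N",  # no remote command, port forwarding only
        "-L", f"{local_port}:127.0.0.1:{remote_port}",
        f"{ssh_user}@{ssh_host}",
    ]


# Direct connection: the MySQL client talks to the remote server itself.
direct = build_dump_command("db.example.com", 3306, "backup", "exampledb")

# Tunneled connection: ssh forwards a local port, and mysqldump uses that port.
tunnel = build_tunnel_command("backup", "db.example.com", 13306)
tunneled = build_dump_command("127.0.0.1", 13306, "backup", "exampledb")

print(" ".join(direct))
print(" ".join(tunnel))
print(" ".join(tunneled))
```

Each list could be handed to subprocess.run, with the dump's standard output redirected to a file on the backup side.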

Pull Definition

Pull: All database backups are of this type. The entire database is downloaded via the ‘mysqldump’ command, committed to a git repository, and uploaded to Amazon S3. Unlike website file backups, incremental downloads are not supported by mysqldump, so the entire database is downloaded from the remote server each time, although the backup is ultimately incremental since it is stored as a commit in a git repository.
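One way to picture why a full dump can still be stored incrementally: when every row is its own INSERT line, the difference between two consecutive dumps only touches the rows that changed. The sketch below uses Python's difflib on two made-up dumps. The table name and values are invented for illustration, and git's real storage is more involved than a line diff, so this only shows the general idea.

```python
import difflib

# Two made-up dumps in the one-INSERT-per-row style; the table and values
# are purely illustrative and do not come from the test database.
dump_first = [
    "INSERT INTO `words` VALUES (1,'hola');",
    "INSERT INTO `words` VALUES (2,'adios');",
    "INSERT INTO `words` VALUES (3,'gracias');",
]
dump_second = [
    "INSERT INTO `words` VALUES (1,'hola');",
    "INSERT INTO `words` VALUES (2,'hasta luego');",  # one row edited
    "INSERT INTO `words` VALUES (3,'gracias');",
    "INSERT INTO `words` VALUES (4,'por favor');",    # one row added
]


def changed_lines(old, new):
    """Return only the added/removed lines of a unified diff, without headers."""
    diff = difflib.unified_diff(old, new, lineterm="", n=0)
    return [
        line for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


changes = changed_lines(dump_first, dump_second)
for line in changes:
    print(line)
```

Only the edited row and the new row appear in the output; the unchanged rows contribute nothing to the difference.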

Test Database

Only one database was used for this suite of tests. In cases where concurrent backups were taking place, the same database was downloaded simultaneously. While a somewhat contrived test setup, this to some approximation simulates other processes accessing database tables concurrently with backups.

Database Statistics
Size: 2723 MB
Row Count: 856262 (21 tables)
Type: Real MySQL database from language-learning website.

Test Host
Server: The host used for testing was a RackSpace CloudServer.
OS: CentOS 6.3
MySQL Server: 5.1.67
Memory: 512MB
CPU Cores: 1

Database Backup Testing Results

The graphs below illustrate the results of each test. Following each are notes discussing the findings.

Metrics and Definitions

  • CPU usage: System: Percentage of CPU usage by system processes.
  • CPU usage: User: Percentage of CPU usage by user processes.
  • % Memory Used: Percentage of system memory used.
  • eth0 in: Network transfer in from the public network connection in KB/s.
  • eth0 out: Network transfer out to the public network connection in KB/s.
  • Server Load (Last 5 Minutes): The numeric representation of the load on the system for the last five minutes. This is a unitless amalgamation of different metrics but, for this system with a single-core processor, loads less than 1.0 are acceptable. Loads above 1.0 indicate that processes are waiting for CPU access. More information can be found here: http://blog.scoutapp.com/articles/2009/07/31/understanding-load-averages
  • I/O Average Queue Size: Weighted number of milliseconds spent doing I/Os. This can provide a measure of both I/O completion time and the backlog that may be accumulating.
  • I/O Wait: Time in milliseconds spent waiting to perform I/O operations.
  • I/O Reads / second: Number of file system reads per second.
  • I/O Writes / second: Number of file system writes per second.
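The load threshold in the list above can be expressed as a small check. This is a minimal sketch, assuming the common rule of thumb of comparing load to the number of cores; the function names are hypothetical, and os.getloadavg is only available on Unix-like systems.

```python
import os


def load_per_core(load_average, cores):
    """Normalize a load average by core count."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return load_average / cores


def is_cpu_saturated(load_average, cores):
    """True when the normalized load exceeds 1.0, i.e. processes are likely queued."""
    return load_per_core(load_average, cores) > 1.0


# On the single-core test host, a load of 0.1 is well under the threshold.
print(is_cpu_saturated(0.1, 1))

# On a live Unix host, the five-minute figure is the second value returned.
try:
    _, five_minute, _ = os.getloadavg()
    print(is_cpu_saturated(five_minute, os.cpu_count() or 1))
except (AttributeError, OSError):
    print("load average not available on this platform")
```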

1. One Backup – MySQL Direct Connection

Figure 1a. Server load graph.

Figure 1b. Server I/O graph.

Notable Times
MySQL Dump Start/End: 6:55AM/7:22AM

As shown in Figures 1a and 1b, mysqldump is neither CPU-intensive nor memory-intensive for a single database backup. Server load remains below 0.1. I/O is affected more significantly, as mysqldump is a read-intensive operation. Reads peak at 55/s with latency topping out at 7 ms. Ethernet output (eth0 out) averages between 1500 and 2300 kB/s. The entire process completes in 27 minutes.
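As a quick sanity check on these figures, the average transfer rate implied by the database size and the run time can be worked out directly. This sketch assumes the dump output is roughly the same size as the reported 2723 MB database, which the document does not state, and the function name is hypothetical.

```python
def average_rate_kb_per_s(size_mb, minutes, kb_per_mb=1024):
    """Average transfer rate in kB/s for size_mb moved in the given minutes."""
    return size_mb * kb_per_mb / (minutes * 60)


# 2723 MB over the 27-minute single backup.
rate = average_rate_kb_per_s(2723, 27)
print(round(rate))  # about 1721 kB/s
```

The result falls inside the 1500 to 2300 kB/s range observed on eth0 out, so the reported size, duration, and network rate are at least mutually consistent under that assumption.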

2. Five Concurrent Backups – MySQL Direct Connection

Figure 2a. Server load graph.

Figure 2b. Server I/O graph.

Notable Times
MySQL Dump Start/End: 10:28AM/1:33PM

Concurrent backups show a similar pattern, with the main impact of concurrency being that all backups take more time. Latency peaks at 13 ms, higher than for the single backup, but peak reads/s are actually lower, likely due to inefficiencies introduced by multiple processes competing for I/O access and causing more hard-disk seeking. Network output again varies between 1500 and 2300 kB/s, with a sharp break half an hour into the backup that lasts for 5 minutes. I do not yet have a good explanation for this pause in the backup process. Network output also lags the start of the backups by about 20 minutes, which means the mysqldump output is not being transferred to CodeGuard during this time.

The entire backup process completes in 3:05 (185 minutes). This is 1.37 times the length of 5 consecutive 27-minute backups (135 minutes), giving an idea of the performance impact of the competition for the shared database resource. However, server performance does not appear to be significantly impacted.
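The 1.37 figure can be reproduced with a short calculation. The helper names here are hypothetical; the inputs are the durations reported above.

```python
def to_minutes(hours, minutes):
    """Convert an h:mm duration to minutes."""
    return hours * 60 + minutes


def slowdown(concurrent_minutes, single_minutes, backups):
    """Ratio of the concurrent run time to the same backups run back to back."""
    return concurrent_minutes / (single_minutes * backups)


# Five concurrent direct backups took 3:05; a single one took 27 minutes.
direct_ratio = slowdown(to_minutes(3, 5), 27, 5)
print(f"{direct_ratio:.2f}")  # 1.37
```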

3. One Backup – MySQL Tunneled over SSH

Figure 3a. Server load graph.

Figure 3b. Server I/O graph.

Notable Times
MySQL Dump Start/End: 9:28AM/9:53AM

A single backup tunneled over SSH closely mirrors the performance of a MySQL direct backup for all metrics. Completion time is very slightly faster at 25 minutes, although this was not tested for reproducibility. See the discussion of the single MySQL direct backup for more information on the overall metrics.

4. Five Concurrent Backups – MySQL tunneled over SSH

Figure 4a. Server load graph.

Figure 4b. Server I/O graph.

Notable Times
MySQL Dump Start/End: 1:47PM/4:37PM

Multiple concurrent MySQL backups tunneled over SSH perform slightly better than concurrent MySQL direct backups. Notably, the pause in the process seen at 40 minutes in direct backups does not occur. Peak I/O wait time is 6.3 ms vs. 13.2 ms for MySQL direct. Average network transfer speeds are lower, possibly due to compression on the SSH tunnel as well as the lack of a break in the process. Despite the lower network speeds, the backups still completed slightly faster, at 2:55 vs. 3:05 for MySQL direct. As in the MySQL direct test, the concurrent backups completed in 1.4x the time of 5 sequential 25-minute backups.

Conclusion

These tests of database backup performance produced solid high-level information on the performance metrics that predict performance of a mysqldump, as well as the load it places on a server, but given the simulated nature of the test system they must be viewed with some skepticism.

Given the above caveat, analysis of the test results shows that:

  • MySQL database backups are almost entirely I/O bound with very little CPU or memory usage.
  • MySQL direct and SSH tunnel behavior is almost identical for single backups.
  • Concurrent MySQL backups tunneled over SSH are slightly faster than MySQL direct backups and appear to have better sustained performance and lower impact on remote server performance.
  • There is a penalty (roughly 1.4x the total time) to running concurrent backups of the same database vs. sequential backups.

Further questions not fully addressed by this testing:

  • What is the real world (i.e. user) impact of the elevated load and I/O wait times?
  • Would using dedicated hardware change the outcome?
  • What is the impact of backups on real-world MySQL database performance? Does backing up certain tables cause significant slowdown in database performance that would impact user experience?
  • Would different MySQL dump options cause different performance profiles and are there ways to optimize the process further with compression of the data either over SSH tunnel or with MySQL?
  • It is possible that the limiting factor is network bandwidth in some phases of the backup. It is unknown what the impact to I/O performance would be in a system with a higher network bandwidth to I/O bandwidth ratio. 

-Randall McPherson, Sr. Engineer

Free Plan Discontinuation; Free Trials Still Here

Yesterday was a tough day for the CodeGuard team. After months of deliberation and discussion, we finally made the decision to stop supporting our grandfathered free user base. This will come as disappointing news to some, so I wanted to take a moment to explain how this came about.

Several months ago, we ceased to offer free plans and began offering free trials instead. Since our public beta launch at TechCrunch Disrupt – NYC in May 2011, we have accumulated tens of thousands of users, all of whom we have supported with automatic website backup, stored on some of the world’s most reliable servers. Providing this service free of charge – storing terabytes of data and performing billions of file examinations daily – has been our pleasure, but for us to be able to support our rapidly growing base of paid customers and produce the new features that they want, we had to make the tough decision to shift resources away from our free grandfathered users. The two main constraints are cost and time.

Backup & monitoring isn’t cheap, when done right

It costs us money every day to run servers to perform the backups, for databases to store the website and file information, and for extremely redundant storage to keep your data safe. Our free users have grown at a very fast rate and at this point, the cost of providing this service for our free user base can no longer be subsidized by our paid users.

Time is the most precious resource

The second constraint is time. As a fast-moving and lean team, time is our most precious resource. We have always been generous with our time when a user contacts us. For years we have afforded our free users the same level of support and assistance as our Professional and Enterprise users. If someone needs help because their site was hacked and they’re not sure what files to restore, we’re not going to turn them away. If a customer needs help configuring their server firewall to allow CodeGuard to connect, we’re not going to tell them we can’t help. On more than one occasion, we’ve gone so far as to purchase and configure an account at the customer’s host to help troubleshoot their issues. Unfortunately, while these are all things we want to help with, they take away time that one of our engineers could be using to implement a new feature or otherwise improve our service.

What now?

So far the response to this change has been very positive and many former free users are now happy paying customers. For those that have had questions about the rationale for this change, I hope that this provides a bit more insight into the reality of our situation.

CodeGuard Developer Internship Program

Who are you?

You: You’re a brilliant student who is either pursuing a degree related to Computer Science or extremely interested in web development. You’re a go-getter, an if-it’s-broken-I’ll-be-darned-if-I-can’t-fix-it-myself type of person. The word “unproductive” makes you cry a little on the inside. You’ve got a head that is constantly spinning with ideas. You want to write real code, for real applications, for real people. Who are we to stop you?

Who are we?

CodeGuard: an Atlanta startup with close ties to the Georgia Tech community, paving the way for website and database backup excellence. If you are looking for a job where you sit bored in a cubicle and twiddle your thumbs all day, this isn’t it. If you’re looking to not get your hands dirty in code, read no further. What we have for you is a unique summer opportunity where you are going to be part of a team that is writing and pushing code for applications that YOU envision.

How will the program be organized?

Glad you asked!

A brief summary of the program goes something like this: 12 weeks, ~40 hours/week, teams of 2-3 (either self-selected or randomly assigned), with each team working on a project of its choice. Weekly 30-minute reports and meetings with our lead developers will keep your project moving. Options for projects include: xyz. The ultimate goal of the program is to prototype a solution by the end of June, and then spend the second half of the summer testing and debugging it in order to deploy by the end of the summer.

For more details, check out our developer program whitepaper HERE.

What will you gain from this program?

Very few things are more highly valued in the job market than practical application of knowledge and demonstrable accomplishments, both of which you will acquire through this program.

To touch on a few highlights:

  1. Work in an agile development environment where you will be assigned to real projects that need real, implementable solutions
  2. Learn about business strategy as it pertains to technical startups, and see how our dev team works with other departments to continue to iterate toward the customer’s desired solution.
  3. Have some programmer friends just itching to try their hands, too? Work on a team of 2-3 with them; there is power in numbers!
  4. Learn terminology and best practices as you engage in our dev team SCRUM and development processes.
  5. Rare opportunity to actually deploy code! If your solution fits the bill, your code will be deployed and utilized by CodeGuard.

Apply today!

Use our online application by clicking HERE. Forward your resume to Alex Roe at alex.roe@codeguard.com. If you are selected to move forward, you will receive an invitation to come into our office for a brief interview and technical assessment. 

Questions?

Email Alex Roe at alex.roe@codeguard.com

CodeGuard Marketing Management Internship Program

We here at CodeGuard are generally excitable and passionate people, but the office has been buzzing with eagerness and anticipation lately as we begin preparations for our first annual Marketing Management Internship Program, an 8 week professional work experience and training program. Below, you will find out who we are, what the program entails, and how to apply.

Who/what is CodeGuard, and where do you fit in?

CodeGuard is an Atlanta-based startup, paving the way to website and database backup excellence. We may be young, but due to a fantastic combination of hard work and timing, we continue to experience explosive growth. We are very involved with the Atlanta and Georgia Tech communities; our CEO and lead developer are Georgia Tech grads, all of our current interns attend Georgia Tech, and we are very active in the local technology scene. This emphasis on our community has inspired our program: Georgia Tech is saturated with talent that is looking for differentiating and truly valuable experiences. You’re that talent, and CodeGuard is that experience.

If you’re not a business student, don’t worry! The program isn’t tailored specifically to business students so much as to motivated, bright, and hardworking students who are interested in startups or technology services; we won’t assume you have any prerequisite knowledge. As David, CodeGuard’s CEO and a mechanical engineering grad, will tell you: engineering professionals need marketing knowledge as much as anybody else. In today’s ever-changing career landscape, you need to be well-rounded, and you need to understand the many facets of businesses. Don’t be shy: give yourself the edge necessary to stand out from your peers!

Before we go any further, let me introduce you to the current Marketing Interns:

Sarah Biggers

“My favorite thing about working at CodeGuard is the exposure to so many different parts of running a business. As someone who wants to own a startup someday, the mentor-mentee relationship with the employees here has been invaluable. Every day I accomplish something that I had no idea how to even start on a few days prior. It has changed the way I approach my own capabilities and challenges.
The program is a really unique opportunity for students who want that extra edge in the job market. Plus, I’ve never seen a program schedule this flexible, which opens a lot of doors for students to work other jobs or take classes.”
Julianne Burch

“Seeing how so many pieces of a young business come together in order to find success has been really exciting for me. CodeGuard has given me so much insight into business practices outside of marketing that I know will help me succeed in any area.


The program is really cool because it’s a perfect combination of classroom style training and professional execution. Plus, you can work with your friends! If this opportunity was in front of me, I would take it in a heartbeat.”
Taylor LeBlanc

“My favorite thing about working at CodeGuard is how much I get to be involved with. Working at CodeGuard has challenged me to learn various cutting-edge marketing functions while getting a grasp on the IT side of our product, and even getting my hands dirty with some maintenance projects around the building.


The program is a great idea because it will challenge you past your comfort zone and push you to learn each week. This isn’t an internship where menial tasks are the norm- you really get to be a part of the company working here.”
Olivia Hill

“Working with such incredible, brilliant people would have to be my favorite part of working at CodeGuard. The people make the whole working and learning environment so enjoyable and they really make me feel like I’m a part of the team.
It’s such an incredible opportunity to get real world experience, without being completely thrust into the real world without a parachute. This program allows you to learn and grow and really discover what it’s like to work in a fast-paced, expanding company.”

Now, onto the good stuff!

Overview of the Internship Program

The Marketing Management Program is a flexible, 10-week (but you get July 4th off!) course credit internship and training program geared toward any year student, in any major, with any career goal. It is designed to give participants real world experience in many areas of marketing, and to help differentiate them in the job market. A brief overview goes something like this: you will be organized in teams, either self-selected or randomly assigned. Each week there will be a required workshop led by CEO David Moeller which will focus on one topic (social media, marketing automation systems, direct sales, etc.). You will then be given an assignment relating to that topic to complete by the next workshop, with the goal of producing something that CodeGuard can use in its marketing plans. Highest performers and achievers will be recognized and encouraged to ask for more time in-office and extra projects. This is very much a “what you get out is what you put into it” type of working opportunity; the possibilities within CodeGuard are virtually endless. For more specific details, check out our internship program whitepaper HERE.

Here’s why you should do it:

Ultimately, the Marketing Management Program is the product of classroom education and knowledge acquisition meeting real world execution and business application. It is the perfect combination of an internship and training program. This type of experience is exactly what future employers are looking for in their potential employees. To be more specific…

  1. You are participating in an Internship and graduating with a certificate of completion of our training program. Two birds, one stone.
  2. Flexibility. You can work a separate paying job if you need the money, take a summer semester of classes, or select the program for course credit as a MGT free elective (if you’re a business student) and spend the rest of your time relaxing in Atlanta, all while mastering your Internship and graduating from the program. Win, win, win.
  3. No intern busy work – learn real strategy, execute to real potential clients, and gather real tangible deliverables for your resume. 
  4. Push aside curriculums that teach you what you “must learn” but never actually need or use; instead, immerse yourself in a program that teaches you what you will need to succeed in the business world.
  5. Sign up alongside your friends and work with them all summer!
  6. Any major can benefit from this experience – it is one of those experiences that looks great on paper and speaks even better during interviews.

Apply Here!

Apply using this online application - make sure you email your resume to sarah.biggers@codeguard.com

Still have questions? Want more information?

Email sarah.biggers@codeguard.com, and I will respond as soon as possible!

-Sarah

Joomla Backup: CodeGuard at Joomla Day North Carolina 2013

Recently, CodeGuard sponsored and attended Joomla Day NC at the Fuqua School of Business in Durham, North Carolina. We were thrilled to sponsor this event and have the opportunity to talk with Joomla developers, designers and aficionados about their Joomla backup needs.

According to builtwith.com, Joomla powers over 2% of the top million websites. With statistics like that, it’s not surprising that we’ve seen an increase in the number of Joomla backups in the CodeGuard ecosystem. CodeGuard was intentionally designed to be a platform agnostic solution, but we recognize the popularity of content management systems like Joomla, WordPress and Drupal. As a result, we are always looking for ways to improve the website backup experience for individuals and organizations using tools like Joomla to manage their content.

At several points during the day, we had the opportunity to talk with Joomla developers, designers and users. These conversations reinforced for us the value that the CodeGuard file monitoring and ChangeAlert features can provide as part of a Joomla backup. Specifically, because these features identify all file changes occurring on a site and notify its owner, website owners are able to act quickly in the event of potential tampering or malicious behavior.

We mostly spent our time at the conference listening to feedback and asking questions about Joomla backup needs. However, David was invited to participate in a security panel and answer questions from the audience. It was clear based on the questions about SSL, vulnerabilities and malware that website security was a real, everyday concern for this group.

All four panelists reinforced that the best security measures are some of the easiest:

1. Have a well secured server or use a hosting provider with good security practices.
2. Keep your Joomla installation and extensions up to date.
3. Always have an off-site backup of your content and database.

More information about Joomla-specific updates and security notifications can be found here: http://developer.joomla.org/security

Thank you again to all of the other sponsors, organizers and attendees for making Joomla Day NC a reality. See you next time!