Efficient way to delete a large amount of records from a big table

Hello,

I want to delete a large number of records (~200K) from a large table
(~500K records) in my MySQL DB.

I want to make this call as efficient as possible because I don't want
the DB to become "unresponsive" while executing it. I need to delete
records that are "older" than 10 days (according to the created_at
column). Currently I use:

delete from table_name where created_at < DATE_SUB(CURDATE(), INTERVAL 10 DAY)

The table also has a primary key id, if that helps.

Any thoughts?
Ran
3/25/2011 3:06:43 PM
comp.databases.mysql

On 3/25/2011 11:06 AM, Ran Margaliot wrote:
> Hello,
>
> I want to delete a large amount of records (~200K) from a large table
> (~500K records) in my MySql DB.
>
> I want to make this call as efficient as possible because i dont want
> the DB to become "unresponsive" while executing the call. I need to
> delete records that are "older" than 10 days (according to created_at
> column), currently I use:
>
> delete from table_name where created_at < DATE_SUB(CURDATE(),INTERVAL
> 10 DAY)
> The table also have a primary key id if it helps.
>
> Any thoughts?

Your primary key would have no effect in a case like this.

First thing to do is to EXPLAIN your query and see if there are any 
indexes being used (i.e. is there an index on created_at?).

Also: How often are you running this query?  How long does it take to 
run currently?  What is the age range of the rows you want to delete?
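A sketch of that check, using the table and column names from the question (on servers before MySQL 5.6, EXPLAIN only accepts SELECT, so you examine the equivalent SELECT rather than the DELETE itself):

```sql
-- EXPLAIN the SELECT that matches the DELETE's WHERE clause to see
-- whether an index is used (a NULL "key" column means a full table scan):
EXPLAIN SELECT COUNT(*)
FROM table_name
WHERE created_at < DATE_SUB(CURDATE(), INTERVAL 10 DAY);

-- If no index is used, adding one on created_at may help
-- (index name is an arbitrary choice):
ALTER TABLE table_name ADD INDEX idx_created_at (created_at);
```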


-- 
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex@attglobal.net
==================
Jerry
3/25/2011 3:11:42 PM
On Mar 25, 3:06 pm, Ran Margaliot <ran5...@gmail.com> wrote:
> Hello,
>
> I want to delete a large amount of records (~200K) from a large table
> (~500K records) in my MySql DB.
>
> I want to make this call as efficient as possible because i dont want
> the DB to become "unresponsive" while executing the call. I need to
> delete records that are "older" than 10 days (according to created_at
> column), currently I use:
>
> delete from table_name where created_at < DATE_SUB(CURDATE(),INTERVAL
> 10 DAY)
> The table also have a primary key id if it helps.
>
> Any thoughts?

Depending on what the data in the table looks like, you could use an
index on the created_at column. That may (or may not) improve the
efficiency.

One thing to bear in mind is that keeping the database from becoming
"noticeably" unresponsive does not necessarily depend on the efficiency
of the query. You could run multiple queries over time with a LIMIT
clause. This would not make the query itself any more efficient (indeed
you might say it makes the overall "process" less efficient), but it
would prevent the database from being tied up with this one task for
too long.
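A sketch of that batched approach, with names taken from the question (the batch size of 1000 is an arbitrary assumption to tune):

```sql
-- Repeat this statement until it affects 0 rows (check ROW_COUNT()
-- or the client's affected-rows count). Each batch holds locks only
-- briefly, so other queries can run between batches.
DELETE FROM table_name
WHERE created_at < DATE_SUB(CURDATE(), INTERVAL 10 DAY)
LIMIT 1000;
```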
Captain
3/25/2011 3:32:01 PM
    Right.  created_at should be an indexed field to speed this up.

    Also, if you're using InnoDB, and created_at is an indexed
field, the operation which is purging old records generally shouldn't
interfere with operations that are adding or working on new
records.

    500K records isn't a large database.

				John Nagle

On 3/25/2011 8:11 AM, Jerry Stuckle wrote:
> On 3/25/2011 11:06 AM, Ran Margaliot wrote:
>> Hello,
>>
>> I want to delete a large amount of records (~200K) from a large table
>> (~500K records) in my MySql DB.
>>
>> I want to make this call as efficient as possible because i dont want
>> the DB to become "unresponsive" while executing the call. I need to
>> delete records that are "older" than 10 days (according to created_at
>> column), currently I use:
>>
>> delete from table_name where created_at < DATE_SUB(CURDATE(),INTERVAL
>> 10 DAY)
>> The table also have a primary key id if it helps.
>>
>> Any thoughts?
>
> Your primary key would have no effect in a case like this.
>
> First thing to do is to EXPLAIN your query and see if there are any
> indexes being used (i.e. is there an index on created_at?).
>
> Also: How often are you running this query? How long does it take to run
> currently? What is the age range of the rows you want to delete?
>
>

John
3/28/2011 5:12:36 AM
El 25/03/2011 16:06, Ran Margaliot escribió/wrote:
> I want to delete a large amount of records (~200K) from a large table
> (~500K records) in my MySql DB.
>
> I want to make this call as efficient as possible because i dont want
> the DB to become "unresponsive" while executing the call. I need to
> delete records that are "older" than 10 days (according to created_at
> column), currently I use:
>
> delete from table_name where created_at < DATE_SUB(CURDATE(),INTERVAL
> 10 DAY)
> The table also have a primary key id if it helps.
>
> Any thoughts?

The DELETE statement has LOW_PRIORITY and QUICK switches. They look like
the first thing to try if you are using MyISAM:

"If you specify LOW_PRIORITY, the server delays execution of the DELETE
until no other clients are reading from the table. This affects only
storage engines that use only table-level locking (such as MyISAM,
MEMORY, and MERGE).

For MyISAM tables, if you use the QUICK keyword, the storage engine does
not merge index leaves during delete, which may speed up some kinds of
delete operations."

http://dev.mysql.com/doc/refman/5.1/en/delete.html

See also http://dev.mysql.com/doc/refman/5.1/en/delete-speed.html
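Combined with the batching idea discussed elsewhere in this thread, that might look like the following (MyISAM only; the table name is from the question and the batch size is an illustrative assumption):

```sql
-- LOW_PRIORITY waits for readers to finish; QUICK skips merging
-- index leaves, trading some index tidiness for delete speed:
DELETE LOW_PRIORITY QUICK FROM table_name
WHERE created_at < DATE_SUB(CURDATE(), INTERVAL 10 DAY)
LIMIT 1000;

-- After many QUICK deletes the index can become fragmented;
-- an occasional OPTIMIZE TABLE reclaims the space:
OPTIMIZE TABLE table_name;
```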


-- 
-- http://alvaro.es - Álvaro G. Vicario - Burgos, Spain
-- My site about web programming: http://borrame.com
-- My glossy humor site: http://www.demogracia.com
--
ISO
3/28/2011 7:11:22 AM
> I want to delete a large amount of records (~200K) from a large table
> (~500K records) in my MySql DB.

For "large" read "small".

> I want to make this call as efficient as possible because i dont want
> the DB to become "unresponsive" while executing the call. I need to

Efficiency may not be the issue here.  If the DELETE query keeps the
table locked, it may block queries that modify that table.

I recall having a problem with this on an older system, expiring
old login records.  (Perhaps deleting 20 million records out of 250
million daily).  Initially, with once-a-day expiration, it locked
up the table for 45 minutes.  Not good.  As this was a 24x7 operation,
doing it overnight still caused problems.  Some caveats on this example:

1)  It used 10-year-old hardware.
2)  It used MyISAM tables (InnoDB wasn't around or not mature enough then).
    InnoDB I believe uses more selective locking.
3)  Finding the records to delete wasn't the issue (with an appropriate
    index); it seemed that actually deleting the records was taking the time.
4)  MySQL has improved a lot since then.
5)  Running the query more often, and with LIMIT, divided the lockout 
    into shorter lockouts that were more tolerable.  However, from the
    point of view of server load, this was a LOT less efficient, as it
    was finding the records 100 times a day instead of once.


DELETE LOW_PRIORITY wouldn't help much, as once it started deleting, it
still held a lock for 45 minutes.
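One way to avoid re-finding the rows on every batch is to resolve the date predicate once and then walk the primary key in ranges. This is only a sketch, under the assumption (not stated in the thread) that id is AUTO_INCREMENT and therefore roughly ordered by created_at:

```sql
-- 1) Find the purge boundary once:
SELECT MAX(id) INTO @max_old
FROM table_name
WHERE created_at < DATE_SUB(CURDATE(), INTERVAL 10 DAY);

-- 2) Delete in contiguous PK ranges; each batch is a cheap
--    index-range scan rather than a fresh search for old rows:
DELETE FROM table_name
WHERE id <= @max_old
ORDER BY id
LIMIT 1000;    -- repeat until 0 rows are affected
```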

> delete records that are "older" than 10 days (according to created_at
> column), currently I use:
> 
> delete from table_name where created_at < DATE_SUB(CURDATE(),INTERVAL
> 10 DAY)

I presume you have an index on created_at.

> The table also have a primary key id if it helps.

I expect the primary key will not help for this query.  Actually,
it will hurt (minimally) because it has to delete entries out of
the primary key.  Don't remove your primary key because of this,
though.  Presumably you actually use this data with other queries
rather than just expiring it.
Gordon Burditt
3/29/2011 1:36:21 AM
On 3/28/2011 6:36 PM, Gordon Burditt wrote:
>> I want to delete a large amount of records (~200K) from a large table
>> (~500K records) in my MySql DB.
>
> For "large" read "small".
>
>> I want to make this call as efficient as possible because i dont want
>> the DB to become "unresponsive" while executing the call. I need to
>
> Efficiency may not be the issue here.  If the DELETE query keeps the
> table locked, it may block queries that modify that table.
>
> I recall having a problem with this on an older system, expiring
> old login records.  (Perhaps deleting 20 million records out of 250
> million daily).  Initially, with once-a-day expiration, it locked
> up the table for 45 minutes.  Not good.  As this was a 24x7 operation,
> doing it overnight still caused problems.  Some caveats on this example:
>
> 1)  It used 10-year-old hardware.
> 2)  It used MyISAM tables (InnoDB wasn't around or not mature enough then).
>      InnoDB I believe uses more selective locking.
> 3)  Finding the records to delete wasn't the issue (with an appropriate
>      index); it seemed that actually deleting the records was taking the time.
> 4)  MySQL has improved a lot since then.
> 5)  Running the query more often, and with LIMIT, divided the lockout
>      into shorter lockouts that were more tolerable.  However, from the
>      point of view of server load, this was a LOT less efficient, as it
>      was finding the records 100 times a day instead of once.

     That's a big delete load.  You could measure the expiration time in
seconds, and spread out the deletion load that way, with about 2 minutes
of deletion per hour.  With an index on the timestamp, InnoDB's row
locking, and a 30-second deletion cycle every 15 minutes, the impact
shouldn't be all that bad.
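For a recurring purge like this, the event scheduler available from MySQL 5.1 on can run those short deletion cycles automatically. A sketch (event name, interval, and batch size are all illustrative assumptions):

```sql
-- The scheduler is off by default:
SET GLOBAL event_scheduler = ON;

-- Purge a bounded batch of old rows every 15 minutes:
CREATE EVENT purge_old_rows
ON SCHEDULE EVERY 15 MINUTE
DO
  DELETE FROM table_name
  WHERE created_at < DATE_SUB(CURDATE(), INTERVAL 10 DAY)
  LIMIT 5000;
```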

					John Nagle
John Nagle
3/30/2011 5:53:44 PM
