How to use the 'lock' prefix to make an instruction atomic (in a multi-threaded program)

Hi all,
  I am writing a multi-threaded assembly program with POSIX threads.
There is a global variable, global_counter, that all threads can modify.
Each thread may increment or decrement it.
  This is what I wrote for each thread (AT&T syntax):
     lock  addl  $1, global_counter
     lock  subl  $1, global_counter
     # or use the 'inc' and 'dec' instructions instead of add/sub
  The main thread needs to read the value of global_counter
atomically. This is what I wrote:
     movl  $0, %eax
     lock  addl  global_counter, %eax

  However, the assembler always complains "illegal instruction". Did I
somehow misuse the lock prefix, or have I misunderstood its purpose?
Any suggestions on how to get it to work?

   Thanks a lot.

cheers,

xiao-lei

spamtrap2 (1627)
8/23/2008 3:36:59 PM
comp.lang.asm.x86
On Sat, 23 Aug 2008 08:36:59 -0700 (PDT), climber.cui
<spamtrap@crayne.org> wrote:

>Hi all,
>  I am writing a multi-threaded assembly program with POSIX threads.
>There is a global variable that all thread can modify. Each thread may
>increment or decrement this variable, global_counter.
>  And this is how i wrote for each thread (at&t syntax):
>     lock  addl $1, global_counter
>     lock  subl  $1, global_counter
>     %% Or use 'inc' and 'dec' instruction instead of add/sub
>  In the main thread, it requires to read the value of global_counter
>atomically, this is what i wrote:
>     movl  $0, %eax
>     lock  addl global_counter,  %eax
>
>  However, the assembler always complains "illegal instruction". Did i
>somehow misuse the lock prefix, or I misunderstood the purpose of the
>lock prefix?  Any suggestions on how to get it work?
>
Try putting the lock on a separate line just above the add or sub.
-- 
ArarghMail808 at [drop the 'http://www.' from ->] http://www.arargh.com
BCET Basic Compiler Page: http://www.arargh.com/basic/index.html

To reply by email, remove the extra stuff from the reply address.

ArarghMail808NOSPAM
8/23/2008 7:00:09 PM
On Sat, 23 Aug 2008 08:36:59 -0700 (PDT)
"climber.cui" <spamtrap@crayne.org> wrote:

> Did i
> somehow misuse the lock prefix, or I misunderstood the purpose of the
> lock prefix?

The lock prefix is only valid on instructions which modify memory.
Reading from memory neither requires nor allows the prefix.
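
As a rough illustration of this rule, here is a minimal C11 sketch (C
rather than assembly, so that it is easy to compile and check). The names
global_counter, counter_inc, counter_dec and counter_read are made up for
this example. On x86, GCC and Clang typically compile the two
read-modify-write helpers to a lock-prefixed instruction and the load to a
plain mov with no lock prefix; that mapping is a compiler detail, not
something the C standard promises.

```c
#include <stdatomic.h>

/* Hypothetical shared counter, standing in for the OP's global_counter.
   Static storage means it starts out as a valid atomic zero. */
static atomic_int global_counter;

/* Read-modify-write: the kind of operation that takes LOCK on x86. */
void counter_inc(void) { atomic_fetch_add(&global_counter, 1); }
void counter_dec(void) { atomic_fetch_sub(&global_counter, 1); }

/* Plain atomic read: nothing is written back, so no LOCK prefix applies. */
int counter_read(void) { return atomic_load(&global_counter); }
```

The main thread would simply call counter_read(); there is no need for the
"add to a zeroed register" trick from the original post.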

-- 
Chuck 
http://www.pacificsites.com/~ccrayne/charles.html

Charles
8/23/2008 9:10:56 PM
On Sat, 23 Aug 2008 14:10:56 -0700, Charles Crayne
<spamtrap@crayne.org> wrote:

>On Sat, 23 Aug 2008 08:36:59 -0700 (PDT)
>"climber.cui" <spamtrap@crayne.org> wrote:
>
>> Did i
>> somehow misuse the lock prefix, or I misunderstood the purpose of the
>> lock prefix?
>
>The lock prefix is only valid on instructions which modify memory.
>Reading from memory neither requires nor allows the prefix.

The OP's first two examples do modify memory; the third doesn't.

ArarghMail808NOSPAM
8/23/2008 11:24:54 PM
On Aug 23, 5:10 pm, Charles Crayne  <spamt...@crayne.org> wrote:
> On Sat, 23 Aug 2008 08:36:59 -0700 (PDT)
>
> "climber.cui" <spamt...@crayne.org> wrote:
> > Did i
> > somehow misuse the lock prefix, or I misunderstood the purpose of the
> > lock prefix?
>
> The lock prefix is only valid on instructions which modify memory.
> Reading from memory neither requires nor allows the prefix.
>
> --
> Chuck
> http://www.pacificsites.com/~ccrayne/charles.html

Yes, it works. Thanks.
But if I write:
  cmpl  $0, global_counter
is the read of global_counter always going to be atomic?

xiao-lei

climber
8/24/2008 7:17:49 AM
"climber.cui"asked:
....
> But, if i wrote :
>   cmp  $0, global_counter
> Is it always going to be atomic read (I mean, reading the value of
> global_counter) ?

Of course the read is atomic by itself, but any following instruction
(such as a conditional jump or cmov) may act on a value that has been
altered in the meantime.
So the CPU designers gave us two variants of fully atomic switches:

____________MP solutions
CHECK_GRANT_HLL:
 mov ecx,false    ;new value to store: false means 'occupied'
 mov eax,true     ;expected value: true means 'free'
 lock cmpxchg dword [global_grant_if_free],ecx
  ;eax true means granted, and the flag is now false
  ;eax false means occupied
 ret              ;also possible: jz granted...
___________MP solutions
CHECK_GRANT_ASM:
 mov eax,grant_if_free_bitnr   ;can span a range of +/-256 MByte
 lock bts dword [global_flags],eax
  ;CY set means the flag was already SET and remains SET (occupied)
  ;NC means it was clear, so granted, and it is now SET
 jnc granted
....
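
For comparison, the same two ideas can be sketched in C11. This is only an
illustration under assumptions: grant_if_free, global_flags, try_grant_cas
and try_grant_bit are hypothetical names, 1 stands for 'free' and 0 for
'occupied', and how the compiler lowers these calls (cmpxchg, bts, or
something else) is up to the compiler.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical grant word: 1 means "free", 0 means "occupied". */
static atomic_int grant_if_free = 1;

/* Hypothetical flag word for the bit-flag variant (all bits clear). */
static atomic_uint global_flags;

/* Compare-and-swap variant: succeeds only if the word was still "free",
   and flips it to "occupied" in the same atomic step. */
bool try_grant_cas(void)
{
    int expected = 1;   /* the value we hope to find: free */
    return atomic_compare_exchange_strong(&grant_if_free, &expected, 0);
}

/* Test-and-set variant: atomically sets one bit (bit must be < 32) and
   reports whether it was clear beforehand, i.e. whether we got the grant. */
bool try_grant_bit(unsigned bit)
{
    unsigned mask = 1u << bit;
    unsigned old = atomic_fetch_or(&global_flags, mask);
    return (old & mask) == 0;
}
```

In both helpers the test and the modification happen as one atomic step,
so there is no window in which another thread can change the value between
the check and the update.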
__
wolfgang


Wolfgang
8/24/2008 9:49:21 AM
"Wolfgang Kern"  <spamtrap@crayne.org> writes:
> "climber.cui"asked:
> ...
>> But, if i wrote :
>>   cmp  $0, global_counter
>> Is it always going to be atomic read (I mean, reading the value of
>> global_counter) ?
>
> Of course the read is atomic by itself, 

Even if global_counter is misaligned? I've known architectures
where the two halves of the fetch from memory (or should I say 
two fetches from memory) would leave room for a race condition, 
and currently have no reason to believe that x86 would be any 
different. However, it's theoretically possible that Intel 
wanted to preserve the apparent atomic nature of such a read.

Phil
-- 
The fact that a believer is happier than a sceptic is no more to the 
point than the fact that a drunken man is happier than a sober one. 
The happiness of credulity is a cheap and dangerous quality.
-- George Bernard Shaw (1856-1950), Preface to Androcles and the Lion

Phil
8/24/2008 5:21:26 PM
Phil Carmody wrote:

>>>   cmp  $0, global_counter
>>> Is it always going to be atomic read (I mean, reading the value of
>>> global_counter) ?

>> Of course the read is atomic by itself,

> Even if global_counter is misaligned? I've known architectures
> where the two halves of the fetch from memory (or should I say
> two fetches from memory) would leave room for a race condition,
> and currently have no reason to believe that x86 would be any
> different. However, it's theoretically possible that Intel
> wanted to preserve the apparent atomic nature of such a read.

I sure hope that a 'global counter' will never reside unaligned  ;)
I can't remember where I read that x86 instructions are never
interrupted in the middle of their job, perhaps as early as the 286/287?
Data that is misaligned or crosses a cache-line boundary will incur
penalty cycles anyway.

__
wolfgang


Wolfgang
8/24/2008 8:34:16 PM
On Sun, 24 Aug 2008 20:21:26 +0300
Phil Carmody <thefatphil_demunged@yahoo.co.uk> wrote:

> I've known architectures
> where the two halves of the fetch from memory (or should I say 
> two fetches from memory) would leave room for a race condition, 
> and currently have no reason to believe that x86 would be any 
> different.

In the 8086/8088 days, this was a potential problem, but since the
introduction of cache memory, the designers have provided ways to
synchronize the caches on two or more processors, and all reads,
and most writes, are atomic. Only those instructions which internally
perform a read->update->write cycle can make use of the LOCK prefix,
although there may be assemblers which will silently discard the
prefix when it is used inappropriately. This has been true since at
least the introduction of the 80386.
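
To see why a read->update->write sequence is the case that needs LOCK,
here is a single-threaded C sketch that replays one unlucky interleaving by
hand. Nothing in it is concurrent: lost_update_demo is a made-up name, and
the two "CPUs" are just local variables.

```c
/* Replays, step by step, what can happen when two CPUs each run a
   non-locked "add $1, counter" on the same memory word.
   This is a deterministic simulation, not real concurrency. */
int lost_update_demo(int start)
{
    int memory = start;   /* the shared word */

    int cpu_a = memory;   /* CPU A: read */
    int cpu_b = memory;   /* CPU B: read, before A has written back */

    cpu_a += 1;           /* CPU A: update */
    cpu_b += 1;           /* CPU B: update */

    memory = cpu_a;       /* CPU A: write */
    memory = cpu_b;       /* CPU B: write, overwriting A's result */

    return memory;        /* start + 1, although two increments ran */
}
```

With the LOCK prefix, one CPU's read, update and write cannot be split by
the other's, so both increments would be counted.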


Charles
8/24/2008 8:46:56 PM
Phil Carmody wrote:
> "Wolfgang Kern"  <spamtrap@crayne.org> writes:
>> "climber.cui"asked:
>> ...
>>> But, if i wrote :
>>>   cmp  $0, global_counter
>>> Is it always going to be atomic read (I mean, reading the value of
>>> global_counter) ?
>> Of course the read is atomic by itself, 
> 
> Even if global_counter is misaligned?

IIRC, according to the document below, global_counter must be
aligned for its read to be atomic.

Intel Memory Ordering White Paper
<www.intel.com/products/processor/manuals/318147.pdf>


Hendrik vdH

Hendrik
8/25/2008 5:56:38 AM
In article <20080824134656.2f1018d1@thor.crayne.org>, 
spamtrap@crayne.org says...
> On Sun, 24 Aug 2008 20:21:26 +0300
> Phil Carmody <thefatphil_demunged@yahoo.co.uk> wrote:
> 
> > I've known architectures
> > where the two halves of the fetch from memory (or should I say 
> > two fetches from memory) would leave room for a race condition, 
> > and currently have no reason to believe that x86 would be any 
> > different.
> 
> In the 8086/8088 days, this was a potential problem, but since the
> introduction of cache memory, the designers have provided ways to
> synchronize the caches on two or more processors, and all reads,
> and most writes, are atomic. Only those instructions which internally
> perform a read->update->write cycle can make use of the LOCK prefix,
> although there may be assemblers which will silently discard the
> prefix when it is used inappropriately. This has been true since at
> least the introduction of the 80386.

This isn't really true, but it can look that way most of the time. It's 
all really a question of HOW badly misaligned an item can get before it 
gets read and/or written in a non-atomic fashion.

On a reasonably current machine, if an item crosses a cache line 
boundary, it won't be read/written atomically.
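
A small C sketch of the two address checks involved may help. The helper
names are hypothetical, and the 64-byte line size is an assumption (common
on current x86 parts, but not queried from the CPU here).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache-line size in bytes; an assumption, not a measured value. */
#define CACHE_LINE_SIZE 64u

/* True if an object of `size` bytes at `addr` is naturally aligned, i.e.
   its address is a multiple of its size (size must be a power of two). */
bool is_naturally_aligned(uintptr_t addr, size_t size)
{
    return (addr & (size - 1)) == 0;
}

/* True if the object's first and last byte fall in different cache lines. */
bool crosses_cache_line(uintptr_t addr, size_t size)
{
    return (addr / CACHE_LINE_SIZE) != ((addr + size - 1) / CACHE_LINE_SIZE);
}
```

A naturally aligned 4-byte counter can never straddle a line, so the
problem case only shows up with packed structures or hand-computed
addresses.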

-- 
    Later,
    Jerry.

The universe is a figment of its own imagination.

Jerry
9/7/2008 11:33:50 PM