Hi,

I am new to Python, and I believe it's an easy question. I know R and Matlab.

************
>>> x=[1,2,3,4,5,6,7]
>>> x[0]
1
>>> x[1:5]
[2, 3, 4, 5]
*************

My question is: what does x[1:5] mean? By Python's convention, the first element of a list is indexed as "0". Doesn't x[1:5] mean a sub-list of x, indexed 1,2,3,4,5? If I am right, it should print [2,3,4,5,6]. Why does it print only [2,3,4,5]?

Thanks!!

John

12/17/2016 7:10:22 PM

On Sat, Dec 17, 2016 at 1:10 PM, John <miaojpm@gmail.com> wrote:
> My question is: what does x[1:5] mean? By Python's convention, the first element of a list is indexed as "0". Doesn't x[1:5] mean a sub-list of x, indexed 1,2,3,4,5? If I am right, it should print [2,3,4,5,6]. Why does it print only [2,3,4,5]?

What you are asking about is "slicing". x[1:5] returns everything from index 1 up to, but NOT including, index 5. See

https://docs.python.org/3/tutorial/introduction.html#strings

which gives examples using strings. A bit later the tutorial addresses slicing in a list context.

BTW, the Python Tutorial is well worth reading in full!

--
boB
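A minimal sketch of the behaviour described above, using the list from the original question. The names `by_hand` and `inclusive` are illustrative only and do not appear in the thread.

```python
x = [1, 2, 3, 4, 5, 6, 7]

# The slice takes the items at indices 1, 2, 3 and 4; index 5 is excluded.
sub = x[1:5]

# The same result spelled out one index at a time.
by_hand = [x[i] for i in range(1, 5)]

# To include index 5 as well, the stop value has to be one larger.
inclusive = x[1:5 + 1]

print(sub)        # [2, 3, 4, 5]
print(inclusive)  # [2, 3, 4, 5, 6]
```

The second print shows the result the question expected; it needs a stop value of 6, not 5.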

12/17/2016 7:24:17 PM

John wrote:
> My question is: what does x[1:5] mean? By Python's convention, the first element of a list is indexed as "0". Doesn't x[1:5] mean a sub-list of x, indexed 1,2,3,4,5? If I am right, it should print [2,3,4,5,6]. Why does it print only [2,3,4,5]?

Python uses half-open intervals, i.e. the first index is included, but the last index is not:

x[1:5] == [x[1], x[2], x[3], x[4]]

The advantage of this convention is that it allows easy splitting and length specification.

>>> items
[0, 10, 20, 30, 40, 50, 60, 70, 80, 90]

To swap the head and tail:

>>> gap = 5
>>> items[gap:] + items[:gap]
[50, 60, 70, 80, 90, 0, 10, 20, 30, 40]

To extract a stride of a given length:

>>> start = 2
>>> length = 3
>>> items[start: start + length]
[20, 30, 40]

The disadvantage is that not everybody follows this convention...
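The two advantages named above (easy splitting and easy length arithmetic) can be checked directly. This is only a sketch; `split_at` is a hypothetical helper introduced for illustration, and the data mirrors the `items` list from the post.

```python
items = list(range(0, 100, 10))   # [0, 10, 20, ..., 90]

def split_at(seq, k):
    """Split seq into two pieces at slice position k (hypothetical helper)."""
    return seq[:k], seq[k:]

head, tail = split_at(items, 5)

# The same number k ends one piece and starts the other,
# so the pieces rejoin with nothing lost and nothing duplicated.
rejoined = head + tail

# For bounds that lie inside the list, the slice length is stop - start.
start, stop = 2, 5
piece = items[start:stop]

print(piece, len(piece))   # [20, 30, 40] 3
```

With closed intervals the split would need `k` on one side and `k + 1` on the other, which is where off-by-one mistakes tend to creep in.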

12/17/2016 8:00:05 PM

On 12/17/2016 2:10 PM, John wrote:
> My question is: what does x[1:5] mean?

The subsequence between slice positions 1 and 5, length 5-1=4. Slice positions are before and after each item, not through them. There are n+1 slice positions for n items: 0 before the first, 1 to n-1 between pairs, and n after the last. Think of slice positions as tick marks on a line, with the length-1 segment between two marks as a cell holding a reference to one item.

 a b c d e
+-+-+-+-+-+
0 1 2 3 4 5

Slice 1:4 of length 3 is the sequence with b, c, d. Slice 3:3 of length 0 is an empty sequence.

> By Python's convention, the first element of a list is indexed as "0".

Think of 0 as .5 rounded down, or as representing the item by its lower bound. Other languages round up to 1, or use the upper bound.

> Doesn't x[1:5] mean a sub-list of x, indexed 1,2,3,4,5?

No. It is the items between 1:2, 2:3, 3:4, 4:5.

Terry Jan Reedy
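The tick-mark picture above can be tried out on the same five items. This is a small sketch; the variable names are chosen here for illustration.

```python
s = "abcde"
n = len(s)

# n items have n + 1 slice positions: 0, 1, ..., n.
positions = list(range(n + 1))

# Cutting at positions 1 and 4 leaves the three cells in between.
middle = s[1:4]

# Cutting twice at the same position leaves nothing.
empty = s[3:3]

# Each single cell is the slice between two neighbouring positions.
cells = [s[i:i + 1] for i in range(n)]

print(positions)  # [0, 1, 2, 3, 4, 5]
print(middle)     # bcd
print(cells)      # ['a', 'b', 'c', 'd', 'e']
```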

12/17/2016 10:18:31 PM

On 17/12/2016 19:10, John wrote:
> My question is: what does x[1:5] mean?

x[A:B] means the slice consisting of x[A], x[A+1],... x[B-1]. (Although slices can be shorter, including those with 0 or 1 elements.)

> By Python's convention, the first element of a list is indexed as "0".

Or the slice from the (A+1)th element to the B'th element inclusive, if you are informally using ordinal indexing (first, second, third etc).

> Doesn't x[1:5] mean a sub-list of x, indexed 1,2,3,4,5?

Sublists and slices, once extracted, are indexed from 0 too.

Play around with some test code, but avoid test data containing numbers that are not too different from possible indices, as that will be confusing! Strings might be better:

  x = "ABCDEFGHIJKLM"

  print (x[1:5])

displays: BCDE

  print (x[1:5][0:2])    # slice of a slice

displays: BC

--
Bartc

12/17/2016 10:53:43 PM

On Sat, 17 Dec 2016 11:10:22 -0800, John wrote:
> My question is: what does x[1:5] mean? By Python's convention, the first element of a list is indexed as "0". Doesn't x[1:5] mean a sub-list of x, indexed 1,2,3,4,5? If I am right, it should print [2,3,4,5,6]. Why does it print only [2,3,4,5]?

As well as all the other excellent & detailed explanations, think of the slice as working on the gaps between the elements (the ',') & not on the elements themselves.

--
In the long run we are all dead. -- John Maynard Keynes

12/18/2016 8:12:37 AM

Hi John,

there is a nice short article by E. W. Dijkstra about why it makes sense to start numbering at zero (and exclude the upper given bound) while slicing a list. Might give a bit of additional understanding.

http://www.cs.utexas.edu/users/EWD/ewd08xx/EWD831.PDF

- paul

On 17.12.2016 at 20:10, John wrote:
> My question is: what does x[1:5] mean? By Python's convention, the first element of a list is indexed as "0". Doesn't x[1:5] mean a sub-list of x, indexed 1,2,3,4,5? If I am right, it should print [2,3,4,5,6]. Why does it print only [2,3,4,5]?

12/18/2016 10:59:54 AM

On 18/12/2016 10:59, Paul Götze wrote:
> Hi John,
>
> there is a nice short article by E. W. Dijkstra about why it makes sense to start numbering at zero (and exclude the upper given bound) while slicing a list. Might give a bit of additional understanding.
>
> http://www.cs.utexas.edu/users/EWD/ewd08xx/EWD831.PDF

(This from somebody who apparently can't use a typewriter?!)

I don't know if the arguments there are that convincing. Both lower bounds of 0 and 1 are useful; some languages will use 0, some 1, and some can have any lower bound.

But a strong argument for using 1 is that in real life things are usually counted from 1 (and measured from 0).

So if you wanted a simple list giving the titles of the chapters in a book or on a DVD, or the colour of the front doors for each house in a street, usually you wouldn't be able to use element 0.

As for slice notation, I tend to informally use (not for any particular language) A..B for an inclusive range, and A:N for a range of length N starting from A.

In Python you can also have a third operand for a range, A:B:C, which can mean that B is not necessarily one past the last in the range, and that the A <= i < B condition in that paper is no longer quite true.

In fact, A:B:-1 corresponds to A >= i > B, which I think is the same as condition (b) in the paper (but backwards), rather than (a), which is favoured.

Another little anomaly in Python is that when negative indices are used, it suddenly switches to 1-based indexing! Or at least, when -index is considered:

  x = [-4,-3,-2,-1]

  print x[-1]       # -1  Notice the correspondence here...
  print x[-2]       # -2

  x = [1, 2, 3, 4]

  print x[1]        # 2   ...and the lack of it here
  print x[2]        # 3

--
Bartc

12/18/2016 4:21:20 PM

On Sun, 18 Dec 2016 16:21:20 +0000, BartC wrote:
> Another little anomaly in Python is that when negative indices are used, it suddenly switches to 1-based indexing! Or at least, when -index is considered:
>
> x = [-4,-3,-2,-1]
>
> print x[-1]       # -1  Notice the correspondence here...
> print x[-2]       # -2
>
> x = [1, 2, 3, 4]
>
> print x[1]        # 2   ...and the lack of it here
> print x[2]        # 3

As I said earlier, take the indices as being the spaces between the elements & it makes much more sense.

--
falsie salesman, n: Fuller bust man.

12/18/2016 8:44:36 PM

On 12/18/2016 09:21 AM, BartC wrote:
> But a strong argument for using 1 is that in real life things are usually counted from 1 (and measured from 0).
>
> So if you wanted a simple list giving the titles of the chapters in a book or on a DVD, or the colour of the front doors for each house in a street, usually you wouldn't be able to use element 0.

It also depends on whether you want to number the spaces between the objects or the objects themselves. To use your DVD example, the first chapter will probably be starting at time zero, not time one.

In another example, babies start out at "zero" years old, not "one." But at the same time we refer to the first year of life. Maybe it's not a phrase much used these days, but it used to be common to say something like "in my 15th year," meaning when I was 14. Maybe a more common use would be "the first year of my employment at this company."

I'm not sure it makes sense to have slicing be zero-based but indexing itself be 1-based, but I think a case could have been made (though I'm glad it was not).

12/18/2016 9:04:56 PM

On 18Dec2016 16:21, BartC <bc@freeuk.com> wrote:
>On 18/12/2016 10:59, Paul Götze wrote:
>>there is a nice short article by E. W. Dijkstra about why it makes sense to start numbering at zero (and exclude the upper given bound) while slicing a list. Might give a bit of additional understanding.
>>
>>http://www.cs.utexas.edu/users/EWD/ewd08xx/EWD831.PDF
>
>(This from somebody who apparently can't use a typewriter?!)
>
>I don't know if the arguments there are that convincing. Both lower bounds of 0 and 1 are useful; some languages will use 0, some 1, and some can have any lower bound.

0 makes a lot of arithmetic simpler if you think of the index as the offset from the start of the array/list.

>But a strong argument for using 1 is that in real life things are usually counted from 1 (and measured from 0).

Shrug. Yep. But again, if you visualise the index as an offset (== "measure") it is a natural fit.

>Another little anomaly in Python is that when negative indices are used, it suddenly switches to 1-based indexing! Or least, when -index is considered:

Not if you consider it to count from the range end. So range 0:5 (which in Python includes indices 0,1,2,3,4); index -1 places you at 5-1 ==> 4, which is consistent. Again, this makes a lot of the arithmetic simpler.

See sig quote for another example of a python style range: birth to death.

Cheers,
--
Cameron Simpson <cs@zip.com.au>

There's no need to worry about death, it will not happen in your lifetime. - Raymond Smullyan
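The "count from the range end" reading can be written as one line of arithmetic. The sketch below assumes a hypothetical helper `normalise` (not a Python built-in) and only handles indices in the valid range -n <= i < n.

```python
def normalise(i, n):
    """Map a possibly negative index to its offset from the start.

    Hypothetical helper for illustration; assumes -n <= i < n.
    """
    return i + n if i < 0 else i

x = [10, 20, 30, 40, 50]
n = len(x)

# Index -1 lands on offset n - 1, index -2 on n - 2, and so on.
last = normalise(-1, n)

# Every valid index, negative or not, refers to the same item
# as its normalised offset.
consistent = all(x[i] == x[normalise(i, n)] for i in range(-n, n))

print(last, consistent)   # 4 True
```

Seen this way, a negative index is simply an offset measured back from the end position `n`.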

12/18/2016 9:56:16 PM

On 18/12/2016 21:04, Michael Torrie wrote:
> On 12/18/2016 09:21 AM, BartC wrote:
>> So if you wanted a simple list giving the titles of the chapters in a book or on a DVD, on the colour of the front doors for each house in a street, usually you wouldn't be able to use element 0.
>
> It also depends on whether you want to number the spaces between the objects or the objects themselves. To use your DVD example, the first chapter will probably be starting at time zero, not time one.
>
> In another example, babies start out at "zero" years old not "one." But at the same time we refer the first year of life. Maybe it's not a phrase much used these days but it used to be common to say something like "in my 15th year," meaning when I was 14. Maybe a more common use would be "the first year of my employment at this company."

There's the fence analogy (perhaps similar to what alister said):

You have a fence made up of one-metre-wide panels that fit between two posts.

For a 10-metre fence, you need 11 posts, and 10 panels.

The posts can conveniently be numbered from 0 to 11, as that also gives you the distance of each one from the start of the fence. But posts are thin.

Panels are wide, and they might as well be conventionally numbered from 1, as you can't use the panel number to tell you how far it is from the start (is it the left bit of the panel or the right bit?).

Panels obviously correspond to the data in each list element; posts are harder to place, except perhaps as alister's commas (but then there have to be extra invisible commas at each end).

(The fence needs to divide an open area, not surround an enclosed space, as otherwise the analogy breaks down; you will have N panels and N posts!)

> I'm not sure it makes sense to having slicing be zero-based but indexing itself be 1-based, but I think a case could have been made (though I'm glad it was not).

They need to be the same.

(Zero-based necessarily has to be used with offsets from pointers, for example. In C, array indexing is inextricably tied up with pointer/offset arithmetic, so indexing /has/ to be zero-based. But that doesn't apply in other languages where the choice could have been different.)

--
Bartc

12/18/2016 10:21:01 PM

On 18/12/2016 22:21, BartC wrote:
> There's the fence analogy (perhaps similar to what alister said):
>
> You have a fence made up of one-metre-wide panels that fit between two posts.
>
> For a 10-metre fence, you need 11 posts, and 10 panels.
>
> The posts can conveniently be numbered from 0 to 11,

.... 0 to 10.

That's the thing with zero-based; it might reduce some off-by-one errors but could introduce others.

With the panels you have 10 panels numbered 1 to 10; what could be simpler or more intuitive?

--
Bartc
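The fence can be modelled in a few lines, with the corrected post numbering of 0 to 10. This is a sketch only; the panel labels and the helper `between` are made up for illustration.

```python
# Ten panels, labelled the everyday way from 1 to 10.
panels = ["panel %d" % k for k in range(1, 11)]

# Eleven posts, numbered by their distance in metres from the start.
posts = list(range(len(panels) + 1))

def between(left_post, right_post):
    """Return the panels standing between two posts (hypothetical helper)."""
    return panels[left_post:right_post]

first_three = between(0, 3)
stretch = between(2, 7)

print(posts[0], posts[-1])   # 0 10
print(first_three)           # ['panel 1', 'panel 2', 'panel 3']
print(len(stretch))          # 5
```

Slice bounds play the role of posts, while the list elements are the panels, so the number of panels in a stretch is just the difference between the two post numbers.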

12/18/2016 10:36:00 PM

BartC <bc@freeuk.com> writes:
> I don't know if the arguments there are that convincing. Both lower bounds of 0 and 1 are useful; some languages will use 0, some 1, and some can have any lower bound.
>
> But a strong argument for using 1 is that in real life things are usually counted from 1 (and measured from 0).

The index of an element is a measure, not a count.

<snip>
--
Ben.

12/19/2016 1:10:55 AM

On 19/12/2016 01:10, Ben Bacarisse wrote:
> BartC <bc@freeuk.com> writes:
>> But a strong argument for using 1 is that in real life things are usually counted from 1 (and measured from 0).
>
> The index of an element is a measure, not a count.

You need to take your C hat off, I think.

--
Bartc

12/19/2016 10:26:56 AM

On Sunday, December 18, 2016 at 11:21:38 AM UTC-5, BartC wrote:
> On 18/12/2016 10:59, Paul Götze wrote:
> > Hi John,
> >
> > there is a nice short article by E. W. Dijkstra about why it makes sense to start numbering at zero (and exclude the upper given bound) while slicing a list. Might give a bit of additional understanding.
> >
> > http://www.cs.utexas.edu/users/EWD/ewd08xx/EWD831.PDF
>
> (This from somebody who apparently can't use a typewriter?!)

Don't judge a book by its rendering technology :)

Dijkstra was "one of the most influential members of computing science's founding generation": https://en.wikipedia.org/wiki/Edsger_W._Dijkstra

--Ned.

12/19/2016 12:32:35 PM

BartC <bc@freeuk.com> writes:
> On 19/12/2016 01:10, Ben Bacarisse wrote:
>> BartC <bc@freeuk.com> writes:
>>> But a strong argument for using 1 is that in real life things are usually counted from 1 (and measured from 0).
>>
>> The index of an element is a measure, not a count.
>
> You need to take your C hat off, I think.

It's a computing hat. Indexes are best seen as offsets (i.e. as measured distances from some origin or base). It's a model that grew out of machine addressing and assembler address modes many, many decades ago -- long before C. C, being a low-level language, obviously borrowed it, but pretty much all the well-thought-out high-level languages have seen the value in it too, though I'd be interested in hearing about counter examples.

The main issue -- of using a half-open interval for a range -- is probably less widely agreed upon, though I think it should be. EWD is correct about this (as about so many things).

--
Ben.

12/19/2016 1:48:01 PM

Ben Bacarisse writes:
> BartC writes:
>
>> You need to take your C hat off, I think.
>
> It's a computing hat. Indexes are best seen as offsets (i.e. as a measured distances from some origin or base). It's a model that grew out of machine addressing and assembler address modes many, many decades ago -- long before C. C, being a low-level language, obviously borrowed it, but pretty much all the well-thought out high-level languages have seen the value in it too, though I'd be interested in hearing about counter examples.

Julia, at version 0.5 of the language, is a major counter-example: 1-based, closed ranges. I think they have been much influenced by the mathematical practice in linear algebra, possibly through another computing language. I think there's some work going on to allow other starting points, or at least 0. Not sure about half-open ranges.

> The main issue -- of using a half open interval for a range -- is probably less widely agreed upon, though I think it should be. EWD is correct about this (as about so many things).

Agreed. (I even use pen and paper, though I don't always remember what I wrote.)

12/19/2016 2:06:40 PM

On 19/12/2016 13:48, Ben Bacarisse wrote:
> BartC <bc@freeuk.com> writes:
>> You need to take your C hat off, I think.
>
> It's a computing hat. Indexes are best seen as offsets (i.e. as a measured distances from some origin or base).

A 1-based or N-based index can still be seen as an offset from element 0, if that is what you want, although it might be an imaginary element.

But if you needed a table of the frequencies of letters A to Z, indexed by the ASCII codes of those letters, then with 0-based you will either waste a lot of entries (table[0] to table[64]), or need to build extra offsets into the code (table[letter-ord('A')]), or have to use a clumsier, less efficient table (ordered dict or whatever).

An N-based array can simply have bounds of ord('A') to ord('Z') inclusive. And the array would have just 26 elements.

> It's a model that grew out of machine addressing and assembler address modes many, many decades ago -- long before C. C, being a low-level language, obviously borrowed it, but pretty much all the well-thought out high-level languages have seen the value in it too, though I'd be interested in hearing about counter examples.

Older languages tend to use 1-based indexing. Newer ones zero, but it's possible C has been an influence if that is what designers and implementers have been exposed to.

Among the 1-based or N-based languages are Fortran, Algol68 and Ada:

https://en.wikipedia.org/wiki/Comparison_of_programming_languages_(array)

(I'd only known 1- or N-based ones, so the languages I've created have all been N-based with a default of 1. With exceptions for certain types, eg: strings are 1-based only; bit-indexing within machine words is 0-based.)

However, if the base for arrays is fixed at 0 or 1, then 0-based is probably more flexible, as you can emulate 1-based indexing by adding an extra element and ignoring element 0; it's a little harder the other way around!

> The main issue -- of using a half open interval for a range -- is probably less widely agreed upon, though I think it should be. EWD is correct about this (as about so many things).

With 0-based lists, half-open intervals are more suited. Otherwise a list of N elements would be indexed by a range of [0:N-1].

While 1-based is more suited to closed intervals: [1:N].

But then, suppose you also had an alternate syntax for a range, of [A^^N], which starts at A and has a length of N elements. Then ranges of [1^^N] and [0^^N] are both specified the same way. You don't need to worry about open or closed intervals.

(I've seen two arrows ^^ used on road signs to denote the length of an impending tunnel rather than the distance to it.)

--
Bartc
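Python has no [A^^N] syntax, but the start-plus-length idea and the "ignore element 0" trick mentioned above can both be approximated with ordinary slices and lists. In this sketch `span` is a hypothetical helper and the day names are made-up data.

```python
def span(seq, start, length):
    """Return `length` items of seq beginning at offset `start`.

    Hypothetical helper for illustration; not a built-in.
    """
    return seq[start:start + length]

letters = "ABCDEFGHIJ"
first_three = span(letters, 0, 3)
from_two = span(letters, 2, 3)

# Emulating 1-based indexing on a 0-based list:
# pad with an unused element 0 and never look at it.
days = [None, "Mon", "Tue", "Wed"]
first_day = days[1]
real_count = len(days) - 1

print(first_three, from_two)   # ABC CDE
print(first_day, real_count)   # Mon 3
```

The padding approach costs one wasted slot and means `len(days)` no longer equals the number of real entries, which is part of why it is described above as only an emulation.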

12/19/2016 3:48:05 PM

Jussi Piitulainen <jussi.piitulainen@helsinki.fi> writes:
> Julia, at version 0.5 of the language, is a major counter-example: 1-based, closed ranges. I think they have been much influenced by the mathematical practice in linear algebra, possibly through another computing language.

Interesting. Thanks.

<snip>
--
Ben.

12/19/2016 8:44:39 PM

BartC wrote:
> But if you needed a table of the frequencies of letters A to Z...
> An N-based array can simply have bounds of ord('A') to ord('Z') inclusive.

That's fine if your language lets you have arrays with arbitrary lower bounds.

But if the language only allows a fixed lower bound, and furthermore insists that the lower bound be 1, then your indexing expression becomes:

   table[ord(letter) - ord('A') + 1]

If a language is only going to allow me a single lower bound, I'd rather it be 0, because I can easily shift that by whatever offset I need. But if it's 1, often I need to cancel out an unwanted 1 first, leading to code that's harder to reason about.

--
Greg
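For comparison, here is how the letter-frequency table looks with Python's fixed lower bound of 0. This is a sketch under stated assumptions: `letter_frequencies` is a hypothetical function, and only the ASCII letters A to Z are counted.

```python
def letter_frequencies(text):
    """Count ASCII letters A-Z in text, ignoring case and everything else.

    Hypothetical helper for illustration.
    """
    table = [0] * 26
    for ch in text.upper():
        if "A" <= ch <= "Z":
            # Shift the character code so that 'A' lands on index 0.
            table[ord(ch) - ord("A")] += 1
    return table

table = letter_frequencies("Abba, C!")

print(table[:3])    # [2, 2, 1]
print(sum(table))   # 5
```

The only offset in the index expression is `ord("A")`; with a forced lower bound of 1 the same line would need an extra `+ 1`, as in the expression quoted above.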

12/19/2016 9:43:35 PM

On Mon, 19 Dec 2016 03:21 am, BartC wrote:
> On 18/12/2016 10:59, Paul Götze wrote:
>> Hi John,
>>
>> there is a nice short article by E. W. Dijkstra about why it makes sense to start numbering at zero (and exclude the upper given bound) while slicing a list. Might give a bit of additional understanding.
>>
>> http://www.cs.utexas.edu/users/EWD/ewd08xx/EWD831.PDF
>
> (This from somebody who apparently can't use a typewriter?!)

Oh Bart don't be so hard on yourself, I don't think it matters whether or not you can use a typewriter, your comments would be equally interesting whether you used a typewriter, spoke them aloud and had them transcribed by voice recognition software, or wrote them out by hand.

(Handwriting is a lost skill in this day and age of people poking at their JesusPhones with their thumb, typing out txt sp3k 1 ltr at a time.)

> I don't know if the arguments there are that convincing. Both lower bounds of 0 and 1 are useful;

Dijkstra doesn't deny that other bounds can be useful. The unstated assumption in his argument is you can only pick one system for indexing into arrays. But of course that's not strictly correct -- different languages can and do choose different schemes, or even choose different schemes for each language.

> some languages will use 0, some 1, and some can have any lower bound.
>
> But a strong argument for using 1 is that in real life things are usually counted from 1 (and measured from 0).

I'm sure that when European merchants first learned of this new-fangled "zero" from their Arabic and Indian colleagues in the Middle Ages, they probably had a quite similar reaction. After all, didn't the Greeks themselves claim that the smallest number was two? Zero is clearly not a number, it is the absence of number, and 1 is not a number, it is unity.
http://math.furman.edu/~mwoodard/fuejum/content/2012/paper1_2012.pdf

You do make a point that even children (at least those about the age of three or four) can understand 1-based indexing, but I don't think it is a *critical* point. Most programmers have learned a bit more than the average four year old. If you can understand threads, OOP design patterns, binary search or C pointers, counting indexes 0, 1, 2, 3, ... should hold no fears for you.

> So if you wanted a simple list giving the titles of the chapters in a book or on a DVD, on the colour of the front doors for each house in a street, usually you wouldn't be able to use element 0.
>
> As for slice notation, I tend to informally use (not for any particular language) A..B for an inclusive range, and A:N for a range of length N starting from A.

An interesting thought... I like the idea of being able to specify slices by index or by length.

> In Python you can also have a third operand for a range, A:B:C, which can mean that B is not necessarily one past the last in the range, and that the A <= i < B condition in that paper is no longer quite true.

You cannot use a simple comparison like A <= i < B to represent a range with a step-size not equal to 1. How would you specify the sequence:

2, 4, 6, 8, 10, 12, 14

as a single expression with ONLY an upper and lower bound?

2 <= i <= 14

clearly doesn't work, and nor does any alternative.

Dijkstra doesn't discuss range() with non-unit step sizes.

> In fact, A:B:-1 corresponds to A >= i > B, which I think is the same as condition (b) in the paper (but backwards), rather (a) which is favoured.

I have no way of knowing what it corresponds to in your head, but in Python, a slice A:B:-1 corresponds to the elements at indices:

A, A-1, A-2, A-3, ..., B+1

*in that order*.

py> '0123456789'[7:2:-1]
'76543'

Since Dijkstra doesn't discuss non-unit step sizes, it is unclear what he would say about counting backwards.
> Another little anomaly in Python is that when negative indices are used, it suddenly switches to 1-based indexing! Or least, when -index is considered:
>
> x = [-4,-3,-2,-1]
>
> print x[-1]       # -1  Notice the correspondence here...
> print x[-2]       # -2

Congratulations, you've discovered that 4-1 equals 3. Well done!

> x = [1, 2, 3, 4]
>
> print x[1]        # 2   ...and the lack of it here
> print x[2]        # 3

That's silly. You can get whichever correspondence you like by choosing values which do or don't match the indices used by Python:

x = [-3, -2, -1, 0]
print x[-1]  # Notice the lack of correspondence

x = [0, 1, 2, 3]
print x[1]  # Notice the correspondence here

x = [6, 7, 8, 9]
print x[1]  # Notice the lack of correspondence here

What's your point?

I infer that you're having difficulty with Python slicing and indexing because you have a mental model where indexes label the items themselves (best viewed in a fixed width font):

+---+---+---+---+---+---+
| P | y | t | h | o | n |
+---+---+---+---+---+---+
...0...1...2...3...4...5    # Indexes
..-6..-5..-4..-3..-2..-1    # 6-1 = 5, 6-2 = 4 etc.

That makes 'Python'[1] easy to visualise (it's obviously just 'y') but slicing becomes more complex:

'Python'[1:4]
=> *include* index 1, *exclude* index 4
=> 'yth'

But a better model is to consider the *boundaries* labelled:

+---+---+---+---+---+---+
| P | y | t | h | o | n |
+---+---+---+---+---+---+
0...1...2...3...4...5...6    # Indexes
-6..-5..-4..-3..-2..-1

A simple index is a bit more complex (think of it as the element immediately following the boundary, hence 'Python'[1] is still 'y') but slicing now becomes simple. Visualise slicing along the boundaries (that's where the name comes from!) and now it is obvious that the start index is included and the stop index is excluded.

+---+---+---+---+---+---+
| P | y | t | h | o | n |
+---+---+---+---+---+---+
....^...........^
cut here     and here

There's no mental model that makes slicing with non-unit step sizes easy, *especially* if the step size is a non-unit negative number (say, -3). For those you really have to simulate the process of actually iterating over the indexes one by one.

--
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure enough, things got worse.
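The "simulate the process of iterating over the indexes" advice can be handed to Python itself, since slice objects have an `indices()` method that resolves start, stop and step against a given length. The wrapper `slice_indices` below is a hypothetical helper written for this sketch.

```python
def slice_indices(start, stop, step, length):
    """List the indices that seq[start:stop:step] visits, in order.

    Hypothetical helper; relies on the built-in slice.indices() method.
    """
    return list(range(*slice(start, stop, step).indices(length)))

digits = "0123456789"

# The example from the post: 7, 6, 5, 4, 3, stopping before index 2.
visited = slice_indices(7, 2, -1, len(digits))
by_hand = "".join(digits[i] for i in visited)

# A non-unit negative step: start at 9 and go down by 3 while above 0.
stepped = slice_indices(9, 0, -3, len(digits))

print(visited, by_hand)   # [7, 6, 5, 4, 3] 76543
print(stepped)            # [9, 6, 3]
```

Listing the indices this way is often quicker than reasoning about a stepped slice in one's head, particularly when start or stop is omitted or negative.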

12/20/2016 12:49:19 AM

On 20/12/2016 00:49, Steve D'Aprano wrote:
> On Mon, 19 Dec 2016 03:21 am, BartC wrote:
>
>> On 18/12/2016 10:59, Paul Götze wrote:
>>> Hi John,
>>>
>>> there is a nice short article by E. W. Dijkstra about why it makes sense
>>> to start numbering at zero (and exclude the upper given bound) while
>>> slicing a list. Might give a bit of additional understanding.
>>>
>>> http://www.cs.utexas.edu/users/EWD/ewd08xx/EWD831.PDF

> If you can understand threads, OOP design patterns, binary search or C
> pointers, counting indexes 0, 1, 2, 3, ... should hold no fears for you.

Who's talking about me? In the area of scripting languages that could be used by people with a wide range of expertise, I would have expected 1-based to be more common.

>> In Python you can also have a third operand for a range, A:B:C, which
>> can mean that B is not necessarily one past the last in the range, and
>> that the A <= i < B condition in that paper is no longer quite true.
>
> You cannot use a simple comparison like A <= i < B to represent a range with
> a step-size not equal to 1. How would you specify the sequence:
>
> 2, 4, 6, 8, 10, 12, 14
>
> as a single expression with ONLY an upper and lower bound?
> 2 <= i <= 14
>
> clearly doesn't work, and nor does any alternative.

I didn't mean that the expression specifies the range, but that any value in the range satisfies the expression.

>> In fact, A:B:-1 corresponds to A >= i > B, which I think is the same as
>> condition (b) in the paper (but backwards), rather (a) which is favoured.
>
> I have no way of knowing what it corresponds to in your head, but in Python,
> a slice A:B:-1 corresponds to the elements at indices:

I got the idea for a minute that, since A:B:C specifies a set of indices, and range(A,B,C) specifies the same set of values, they are interchangeable. But of course you can't use x[range(A,B,C)] and you can't do 'for i in A:B:C'. [In my, ahem, own language, you can do just that...]

> A, A-1, A-2, A-3, ..., B+1
>
> *in that order*.
>
> py> '0123456789'[7:2:-1]
> '76543'

When A is 7, B is 2 and C is -1, then

for i in range(-1000,1000):
    if A >= i > B: print(i)

displays only the numbers 3,4,5,6,7; the same elements you get, and they satisfy A >= i > B as I said.

>> Another little anomaly in Python is that when negative indices are used,
>> it suddenly switches to 1-based indexing! Or least, when -index is
>> considered:

> That's silly. You can get whichever correspondence you like by choosing
> values which do or don't match the indices used by Python:
>
> x = [-3, -2, -1, 0]
> print x[-1] # Notice the lack of correspondence
>
> x = [0, 1, 2, 3]
> print x[1] # Notice the correspondence here
>
> x = [6, 7, 8, 9]
> print x[1] # Notice the lack of correspondence here

> What's your point?

The point is that indices go 0,1,2,3,... when indexed from the start, and -1,-2,-3,... when indexed from the end.

That means that the THIRD item from the start has index [2], but the THIRD item from the end has index [-3].

(So if you reversed a list x, and wanted the same x[i] element you had before it was reversed, it now has to be x[-(i+1)]. 1-based, the elements would be x[i] and x[-i], a little more symmetric. But with N-based, you wouldn't be able to index from the end at all.)

> +---+---+---+---+---+---+
> | P | y | t | h | o | n |
> +---+---+---+---+---+---+
> ....^...........^
> cut here and here

That's the fence/fencepost thing I mentioned elsewhere. It comes up also in graphics: if these boxes represent pixels rather than elements, and a function draws a line from pixel 1 to pixel 4 or fills them in, then should pixel 4 be filled in or not?

But if you think of these values as continuous measures from the left edge, rather than as discrete units, and denote them as 1.0 to 4.0, then it becomes more obvious. The 3 pixels between 1.0 and 4.0 are filled in.

--
bartc
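
The claims in this post can be checked with a short sketch. It assumes Python 3 syntax rather than the Python 2 print statements used in the thread, and the variable names (`slice_indices`, `matching`, `rev`, and so on) are illustrative only. It shows that the indices visited by A:B:-1 are the values satisfying A >= i > B (in descending rather than ascending order), that a `range` cannot be used as an index while a `slice` object can, and that element i of a list sits at index -(i+1) after reversal.

```python
A, B = 7, 2
digits = "0123456789"

# Indices visited by the slice A:B:-1, in slice order (A down to B+1).
slice_indices = list(range(A, B, -1))
assert slice_indices == [7, 6, 5, 4, 3]
assert digits[A:B:-1] == "76543"

# The same values satisfy A >= i > B; scanning upward lists them ascending.
matching = [i for i in range(-1000, 1000) if A >= i > B]
assert matching == [3, 4, 5, 6, 7]
assert sorted(slice_indices) == matching

# A slice object can be built explicitly and used as an index...
assert digits[slice(A, B, -1)] == "76543"

# ...but a range object is not accepted as a str or list index.
try:
    digits[range(A, B, -1)]
except TypeError:
    range_index_ok = False
else:
    range_index_ok = True

# Reversal: element i of x sits at index -(i + 1) of the reversed list.
x = [10, 20, 30, 40]
rev = x[::-1]
mirror_ok = all(x[i] == rev[-(i + 1)] for i in range(len(x)))

print(range_index_ok, mirror_ok)  # False True
```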

12/20/2016 11:45:07 AM