python list index - an easy question

Hi, 

   I am new to Python, and I believe it's an easy question. I know R and Matlab.

************
>>> x=[1,2,3,4,5,6,7]
>>> x[0]
1
>>> x[1:5]
[2, 3, 4, 5]
*************

    My question is: what does x[1:5] mean? By Python's convention, the first element of a list is indexed as "0". Doesn't x[1:5] mean a sub-list of x, indexed 1,2,3,4,5? If I am right, it should print [2,3,4,5,6]. Why does it print only [2,3,4,5]?

   Thanks!!

John
John
12/17/2016 7:10:22 PM
comp.lang.python

On Sat, Dec 17, 2016 at 1:10 PM, John <miaojpm@gmail.com> wrote:
>
> Hi,
>
>    I am new to Python, and I believe it's an easy question. I know R and Matlab.
>
> ************
> >>> x=[1,2,3,4,5,6,7]
> >>> x[0]
> 1
> >>> x[1:5]
> [2, 3, 4, 5]
> *************
>
>     My question is: what does x[1:5] mean? By Python's convention, the first element of a list is indexed as "0". Doesn't x[1:5] mean a sub-list of x, indexed 1,2,3,4,5? If I am right, it should print [2,3,4,5,6]. Why does it print only [2,3,4,5]?
>

What you are asking about is "slicing".  x[1:5] returns everything
between index 1 through, but NOT including, index 5.  See

https://docs.python.org/3/tutorial/introduction.html#strings

which will give examples using strings.  A bit later the tutorial
addresses slicing in a list context.
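
A minimal sketch of that rule, reusing the list from the question (the name `sub` is only illustrative):

```python
x = [1, 2, 3, 4, 5, 6, 7]

# x[1:5] takes indices 1, 2, 3, 4; the stop index 5 is excluded.
sub = x[1:5]
print(sub)            # [2, 3, 4, 5]

# The length of a slice is stop - start (when both are in range).
print(len(sub))       # 4

# To get the elements at indices 1 through 5 inclusive, stop one later.
print(x[1:6])         # [2, 3, 4, 5, 6]
```

So the result the question expected, [2,3,4,5,6], is what x[1:6] returns.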

BTW, the Python Tutorial is well worth reading in full!


-- 
boB
boB
12/17/2016 7:24:17 PM
John wrote:

> Hi,
> 
>    I am new to Python, and I believe it's an easy question. I know R and
>    Matlab.
> 
> ************
> >>> x=[1,2,3,4,5,6,7]
> >>> x[0]
> 1
> >>> x[1:5]
> [2, 3, 4, 5]
> *************
> 
>     My question is: what does x[1:5] mean? By Python's convention, the
>     first element of a list is indexed as "0". Doesn't x[1:5] mean a
>     sub-list of x, indexed 1,2,3,4,5? If I am right, it should print
>     [2,3,4,5,6]. Why does it print only [2,3,4,5]?

Python uses half-open intervals, i.e. the first index is included, but the 
last index is not:

x[1:5] == [x[1], x[2], x[3], x[4]]

The advantage of this convention is that it allows easy splitting and length 
specification.

>>> items
[0, 10, 20, 30, 40, 50, 60, 70, 80, 90]

To swap the head and tail:

>>> gap = 5
>>> items[gap:] + items[:gap]
[50, 60, 70, 80, 90, 0, 10, 20, 30, 40]

To extract a run of a given length:

>>> start = 2
>>> length = 3
>>> items[start: start + length]
[20, 30, 40]
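
The two properties above can be checked mechanically. This is only a sketch; it assumes `items` holds the ten values shown earlier, and `head`, `tail` and `chunk` are illustrative names:

```python
items = list(range(0, 100, 10))   # [0, 10, ..., 90]

# Splitting at any position k loses nothing and duplicates nothing.
for k in range(len(items) + 1):
    head, tail = items[:k], items[k:]
    assert head + tail == items
    assert len(head) == k

# A slice of `length` items starting at `start`: no "+ 1" or "- 1" needed.
start, length = 2, 3
chunk = items[start:start + length]
print(chunk)          # [20, 30, 40]
```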

The disadvantage is that not everybody follows this convention...

Peter
12/17/2016 8:00:05 PM
On 12/17/2016 2:10 PM, John wrote:
> Hi,
>
>    I am new to Python, and I believe it's an easy question. I know R and Matlab.
>
> ************
> >>> x=[1,2,3,4,5,6,7]
> >>> x[0]
> 1
> >>> x[1:5]
> [2, 3, 4, 5]
> *************
>
>     My question is: what does x[1:5] mean?

The subsequence between slice positions 1 and 5, length 5-1=4.
Slice positions are before and after each item, not through them.
There are n+1 slice positions for n items: 0 before the first,
1 to n-1 between pairs, and n after the last.
Think of slice positions as tick marks on a line with the length 1
segment between as a cell holding a reference to one item.

  a b c d e
+-+-+-+-+-+
0 1 2 3 4 5

Slice 1:4 of length 3 is the sequence b, c, d.
Slice 3:3 of length 0 is an empty sequence.
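
A small sketch of the tick-mark picture, using the string "abcde" in place of the items a to e (`positions` and `cells` are illustrative names):

```python
s = "abcde"

# n items have n + 1 slice positions: 0 before the first, n after the last.
positions = list(range(len(s) + 1))
print(positions)      # [0, 1, 2, 3, 4, 5]

print(s[1:4])         # bcd  (length 4 - 1 = 3)
print(repr(s[3:3]))   # ''   (length 0: both cuts at the same position)

# Each single item sits between adjacent positions i and i + 1.
cells = [s[i:i + 1] for i in range(len(s))]
print(cells)          # ['a', 'b', 'c', 'd', 'e']
```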

> By Python's convention, the first element of a list is indexed as "0".

Think of 0 as .5 rounded down, or as representing the cell by its lower bound.
Other languages round up to 1, or use the upper bound.

> Doesn't x[1:5] mean a sub-list of x, indexed 1,2,3,4,5?

No.  It is items between 1:2, 2:3, 3:4, 4:5.


Terry Jan Reedy

Terry
12/17/2016 10:18:31 PM
On 17/12/2016 19:10, John wrote:
> Hi,
>
>    I am new to Python, and I believe it's an easy question. I know R and Matlab.
>
> ************
> >>> x=[1,2,3,4,5,6,7]
> >>> x[0]
> 1
> >>> x[1:5]
> [2, 3, 4, 5]
> *************
>
>     My question is: what does x[1:5] mean?

x[A:B] means the slice consisting of x[A], x[A+1],... x[B-1]. (Although 
slices can be shorter, including those with 0 or 1 elements.)

> By Python's convention, the first element of a list is indexed as "0".

Or the slice from the (A+1)th element to the B'th element inclusive, if 
you are informally using ordinal indexing (first, second, third etc).

> Doesn't x[1:5] mean a sub-list of x, indexed 1,2,3,4,5?

Sublists and slices, once extracted, are indexed from 0 too.

Play around with some test code, but avoid test data containing numbers 
that are not too different from possible indices as that will be confusing!

Strings might be better:

   x = "ABCDEFGHIJKLM"

   print (x[1:5])

displays: BCDE

   print (x[1:5][0:2])        # slice of a slice

displays: BC

-- 
Bartc
BartC
12/17/2016 10:53:43 PM
On Sat, 17 Dec 2016 11:10:22 -0800, John wrote:

> Hi,
> 
>    I am new to Python, and I believe it's an easy question. I know R and
>    Matlab.
> 
> ************
> >>> x=[1,2,3,4,5,6,7]
> >>> x[0]
> 1
> >>> x[1:5]
> [2, 3, 4, 5]
> *************
> 
>     My question is: what does x[1:5] mean? By Python's convention, the
>     first element of a list is indexed as "0". Doesn't x[1:5] mean a
>     sub-list of x, indexed 1,2,3,4,5? If I am right, it should print
>     [2,3,4,5,6]. Why does it print only [2,3,4,5]?
> 
>    Thanks!!
> 
> John

As well as all the other excellent & detailed explanations,
think of the slice as working on the gaps between the elements (the ',') & 
not the elements themselves.




-- 
In the long run we are all dead.
		-- John Maynard Keynes
alister
12/18/2016 8:12:37 AM
Hi John,

there is a nice short article by E. W. Dijkstra about why it makes sense
to start numbering at zero (and exclude the upper given bound) while
slicing a list. Might give a bit of additional understanding.

http://www.cs.utexas.edu/users/EWD/ewd08xx/EWD831.PDF

- paul


On 17.12.2016 at 20:10, John wrote:
> Hi,
>
>    I am new to Python, and I believe it's an easy question. I know R and Matlab.
>
> ************
> >>> x=[1,2,3,4,5,6,7]
> >>> x[0]
> 1
> >>> x[1:5]
> [2, 3, 4, 5]
> *************
>
>     My question is: what does x[1:5] mean? By Python's convention, the first element of a list is indexed as "0". Doesn't x[1:5] mean a sub-list of x, indexed 1,2,3,4,5? If I am right, it should print [2,3,4,5,6]. Why does it print only [2,3,4,5]?
>
>    Thanks!!
>
> John


Paul
12/18/2016 10:59:54 AM
On 18/12/2016 10:59, Paul Götze wrote:
> Hi John,
>
> there is a nice short article by E. W. Dijkstra about why it makes sense
> to start numbering at zero (and exclude the upper given bound) while
> slicing a list. Might give a bit of additional understanding.
>
> http://www.cs.utexas.edu/users/EWD/ewd08xx/EWD831.PDF

(This from somebody who apparently can't use a typewriter?!)

I don't know if the arguments there are that convincing. Both lower 
bounds of 0 and 1 are useful; some languages will use 0, some 1, and 
some can have any lower bound.

But a strong argument for using 1 is that in real life things are 
usually counted from 1 (and measured from 0).

So if you wanted a simple list giving the titles of the chapters in a 
book or on a DVD, or the colour of the front doors for each house in a 
street, usually you wouldn't be able to use element 0.

As for slice notation, I tend to informally use (not for any particular 
language) A..B for an inclusive range, and A:N for a range of length N 
starting from A.

In Python you can also have a third operand for a range, A:B:C, which 
can mean that B is not necessarily one past the last in the range, and 
that the A <= i < B condition in that paper is no longer quite true.

In fact, A:B:-1 corresponds to A >= i > B, which I think is the same as 
condition (b) in the paper (but backwards), rather than (a), which is favoured.

Another little anomaly in Python is that when negative indices are used, 
it suddenly switches to 1-based indexing! Or at least, when -index is 
considered:

   x = [-4,-3,-2,-1]

   print x[-1]       # -1  Notice the correspondence here...
   print x[-2]       # -2

   x = [1, 2, 3, 4]

   print x[1]        # 2   ...and the lack of it here
   print x[2]        # 3


-- 
Bartc

BartC
12/18/2016 4:21:20 PM
On Sun, 18 Dec 2016 16:21:20 +0000, BartC wrote:

> On 18/12/2016 10:59, Paul Götze wrote:
>> Hi John,
>>
>> there is a nice short article by E. W. Dijkstra about why it makes
>> sense to start numbering at zero (and exclude the upper given bound)
>> while slicing a list. Might give a bit of additional understanding.
>>
>> http://www.cs.utexas.edu/users/EWD/ewd08xx/EWD831.PDF
> 
> (This from somebody who apparently can't use a typewriter?!)
> 
> I don't know if the arguments there are that convincing. Both lower
> bounds of 0 and 1 are useful; some languages will use 0, some 1, and
> some can have any lower bound.
> 
> But a strong argument for using 1 is that in real life things are
> usually counted from 1 (and measured from 0).
> 
> So if you wanted a simple list giving the titles of the chapters in a
> book or on a DVD, on the colour of the front doors for each house in a
> street, usually you wouldn't be able to use element 0.
> 
> As for slice notation, I tend to informally use (not for any particulr
> language) A..B for an inclusive range, and A:N for a range of length N
> starting from A.
> 
> In Python you can also have a third operand for a range, A:B:C, which
> can mean that B is not necessarily one past the last in the range, and
> that the A <= i < B condition in that paper is no longer quite true.
> 
> In fact, A:B:-1 corresponds to A >= i > B, which I think is the same as
> condition (b) in the paper (but backwards), rather (a) which is
> favoured.
> 
> Another little anomaly in Python is that when negative indices are used,
> it suddenly switches to 1-based indexing! Or least, when -index is
> considered:
> 
>    x = [-4,-3,-2,-1]
> 
>    print x[-1]       # -1  Notice the correspondence here...
>    print x[-2]       # -2
> 
>    x = [1, 2, 3, 4]
> 
>    print x[1]        # 2   ...and the lack of it here
>    print x[2]        # 3


As I said earlier,
take the indices as being the spaces between the elements & it makes 
much more sense.


-- 
falsie salesman, n:
	Fuller bust man.
alister
12/18/2016 8:44:36 PM
On 12/18/2016 09:21 AM, BartC wrote:
> On 18/12/2016 10:59, Paul Götze wrote:
>> Hi John,
>>
>> there is a nice short article by E. W. Dijkstra about why it makes sense
>> to start numbering at zero (and exclude the upper given bound) while
>> slicing a list. Might give a bit of additional understanding.
>>
>> http://www.cs.utexas.edu/users/EWD/ewd08xx/EWD831.PDF
> 
> (This from somebody who apparently can't use a typewriter?!)
> 
> I don't know if the arguments there are that convincing. Both lower 
> bounds of 0 and 1 are useful; some languages will use 0, some 1, and 
> some can have any lower bound.
> 
> But a strong argument for using 1 is that in real life things are 
> usually counted from 1 (and measured from 0).
> 
> So if you wanted a simple list giving the titles of the chapters in a 
> book or on a DVD, on the colour of the front doors for each house in a 
> street, usually you wouldn't be able to use element 0.

It also depends on whether you want to number the spaces between the
objects or the objects themselves. To use your DVD example, the first
chapter will probably be starting at time zero, not time one.

In another example, babies start out at "zero" years old not "one."  But
at the same time we refer to the first year of life.  Maybe it's not a
phrase much used these days but it used to be common to say something
like "in my 15th year," meaning when I was 14.  Maybe a more common use
would be "the first year of my employment at this company."

I'm not sure it makes sense to have slicing be zero-based but indexing
itself be 1-based, but I think a case could have been made (though I'm
glad it was not).
Michael
12/18/2016 9:04:56 PM
On 18Dec2016 16:21, BartC <bc@freeuk.com> wrote:
>On 18/12/2016 10:59, Paul Götze wrote:
>>there is a nice short article by E. W. Dijkstra about why it makes sense
>>to start numbering at zero (and exclude the upper given bound) while
>>slicing a list. Might give a bit of additional understanding.
>>
>>http://www.cs.utexas.edu/users/EWD/ewd08xx/EWD831.PDF
>
>(This from somebody who apparently can't use a typewriter?!)
>
>I don't know if the arguments there are that convincing. Both lower
>bounds of 0 and 1 are useful; some languages will use 0, some 1, and
>some can have any lower bound.

0 makes a lot of arithmetic simpler if you think of the index as the offset
from the start of the array/list.

>But a strong argument for using 1 is that in real life things are
>usually counted from 1 (and measured from 0).

Shrug. Yep. But again, if you visualise the index as an offset (== "measure")
it is a natural fit.

>Another little anomaly in Python is that when negative indices are used, it
>suddenly switches to 1-based indexing! Or least, when -index is considered:

Not if you consider it to count from the range end. So range 0:5 (which in
Python includes indices 0,1,2,3,4); index -1 places you at 5-1 ==> 4, which is
consistent. Again, this makes a lot of the arithmetic simpler.
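
A short sketch of that arithmetic, assuming a five-element list so that the range is 0:5 as above (`x` and `n` are illustrative names):

```python
x = [10, 20, 30, 40, 50]
n = len(x)            # the range 0:5 covers indices 0, 1, 2, 3, 4

# A negative index -k is an offset back from the end: it means n - k.
for k in range(1, n + 1):
    assert x[-k] == x[n - k]

print(x[-1], x[n - 1])     # 50 50

# The same rule applies to slice bounds.
print(x[-3:], x[n - 3:])   # [30, 40, 50] [30, 40, 50]
```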

See sig quote for another example of a python style range: birth to death.

Cheers,
-- 
Cameron Simpson <cs@zip.com.au>

There's no need to worry about death, it will not happen in your lifetime.
        - Raymond Smullyan
Cameron
12/18/2016 9:56:16 PM
On 18/12/2016 21:04, Michael Torrie wrote:
> On 12/18/2016 09:21 AM, BartC wrote:

>> So if you wanted a simple list giving the titles of the chapters in a
>> book or on a DVD, on the colour of the front doors for each house in a
>> street, usually you wouldn't be able to use element 0.
>
> It also depends on whether you want to number the spaces between the
> objects or the objects themselves. To use your DVD example, the first
> chapter will probably be starting at time zero, not time one.
>
> In another example, babies start out at "zero" years old not "one."  But
> at the same time we refer the first year of life.  Maybe it's not a
> phrase much used these days but it used to be common to say something
> like "in my 15th year," meaning when I was 14.  Maybe a more common use
> would be "the first year of my employment at this company."

There's the fence analogy (perhaps similar to what alister said):

You have a fence made up of one-metre-wide panels that fit between two 
posts.

For a 10-metre fence, you need 11 posts, and 10 panels.

The posts can conveniently be numbered from 0 to 11, as that also gives 
you the distance of each one from the start of the fence.

But posts are thin. Panels are wide, and they might as well be 
conventionally numbered from 1, as you can't use the panel number to 
tell you how far it is from the start (is it the left bit of the panel 
or the right bit?).

Panels obviously correspond to the data in each list element; posts are 
harder to place, except perhaps as alister's commas (but then there have 
to be extra invisible commas at each end).

(The fence needs to divide an open area not surround an enclosed space, 
as otherwise the analogy breaks down; you will have N panels and N posts!)

> I'm not sure it makes sense to having slicing be zero-based but indexing
> itself be 1-based, but I think a case could have been made (though I'm
> glad it was not).

They need to be the same.

(Zero-based necessarily has to be used with offsets from pointers, for 
example. In C, array indexing is inextricably tied up with 
pointer/offset arithmetic, so indexing /has/ to be zero-based.

But that doesn't apply in other languages where the choice could have 
been different.)

-- 
Bartc
BartC
12/18/2016 10:21:01 PM
On 18/12/2016 22:21, BartC wrote:
> On 18/12/2016 21:04, Michael Torrie wrote:
>> On 12/18/2016 09:21 AM, BartC wrote:
>
>>> So if you wanted a simple list giving the titles of the chapters in a
>>> book or on a DVD, on the colour of the front doors for each house in a
>>> street, usually you wouldn't be able to use element 0.
>>
>> It also depends on whether you want to number the spaces between the
>> objects or the objects themselves. To use your DVD example, the first
>> chapter will probably be starting at time zero, not time one.
>>
>> In another example, babies start out at "zero" years old not "one."  But
>> at the same time we refer the first year of life.  Maybe it's not a
>> phrase much used these days but it used to be common to say something
>> like "in my 15th year," meaning when I was 14.  Maybe a more common use
>> would be "the first year of my employment at this company."
>
> There's the fence analogy (perhaps similar to what alister said):
>
> You have a fence made up of one-metre-wide panels that fit between two
> posts.
>
> For a 10-metre fence, you need 11 posts, and 10 panels.
>
> The posts can conveniently be numbered from 0 to 11,

.... 0 to 10.

That's the thing with zero-based; it might reduce some off-by-one errors 
but could introduce others.

With the panels you have 10 panels numbered 1 to 10; what could be 
simpler or more intuitive?

-- 
Bartc
BartC
12/18/2016 10:36:00 PM
BartC <bc@freeuk.com> writes:

> On 18/12/2016 10:59, Paul Götze wrote:
>> there is a nice short article by E. W. Dijkstra about why it makes sense
>> to start numbering at zero (and exclude the upper given bound) while
>> slicing a list. Might give a bit of additional understanding.
>>
>> http://www.cs.utexas.edu/users/EWD/ewd08xx/EWD831.PDF
>
> (This from somebody who apparently can't use a typewriter?!)
>
> I don't know if the arguments there are that convincing. Both lower
> bounds of 0 and 1 are useful; some languages will use 0, some 1, and
> some can have any lower bound.
>
> But a strong argument for using 1 is that in real life things are
> usually counted from 1 (and measured from 0).

The index of an element is a measure, not a count.

<snip>
-- 
Ben.
Ben
12/19/2016 1:10:55 AM
On 19/12/2016 01:10, Ben Bacarisse wrote:
> BartC <bc@freeuk.com> writes:
>
>> On 18/12/2016 10:59, Paul Götze wrote:
>>> there is a nice short article by E. W. Dijkstra about why it makes sense
>>> to start numbering at zero (and exclude the upper given bound) while
>>> slicing a list. Might give a bit of additional understanding.
>>>
>>> http://www.cs.utexas.edu/users/EWD/ewd08xx/EWD831.PDF
>>
>> (This from somebody who apparently can't use a typewriter?!)
>>
>> I don't know if the arguments there are that convincing. Both lower
>> bounds of 0 and 1 are useful; some languages will use 0, some 1, and
>> some can have any lower bound.
>>
>> But a strong argument for using 1 is that in real life things are
>> usually counted from 1 (and measured from 0).
>
> The index of an element is a measure, not a count.

You need to take your C hat off, I think.

-- 
Bartc
BartC
12/19/2016 10:26:56 AM
On Sunday, December 18, 2016 at 11:21:38 AM UTC-5, BartC wrote:
> On 18/12/2016 10:59, Paul Götze wrote:
> > Hi John,
> >
> > there is a nice short article by E. W. Dijkstra about why it makes sense
> > to start numbering at zero (and exclude the upper given bound) while
> > slicing a list. Might give a bit of additional understanding.
> >
> > http://www.cs.utexas.edu/users/EWD/ewd08xx/EWD831.PDF
>
> (This from somebody who apparently can't use a typewriter?!)

Don't judge a book by its rendering technology :)   Dijkstra was
"one of the most influential members of computing science's founding
generation": https://en.wikipedia.org/wiki/Edsger_W._Dijkstra

--Ned.
Ned
12/19/2016 12:32:35 PM
BartC <bc@freeuk.com> writes:

> On 19/12/2016 01:10, Ben Bacarisse wrote:
>> BartC <bc@freeuk.com> writes:
>>
>>> On 18/12/2016 10:59, Paul Götze wrote:
>>>> there is a nice short article by E. W. Dijkstra about why it makes sense
>>>> to start numbering at zero (and exclude the upper given bound) while
>>>> slicing a list. Might give a bit of additional understanding.
>>>>
>>>> http://www.cs.utexas.edu/users/EWD/ewd08xx/EWD831.PDF
>>>
>>> (This from somebody who apparently can't use a typewriter?!)
>>>
>>> I don't know if the arguments there are that convincing. Both lower
>>> bounds of 0 and 1 are useful; some languages will use 0, some 1, and
>>> some can have any lower bound.
>>>
>>> But a strong argument for using 1 is that in real life things are
>>> usually counted from 1 (and measured from 0).
>>
>> The index of an element is a measure, not a count.
>
> You need to take your C hat off, I think.

It's a computing hat.  Indexes are best seen as offsets (i.e. as
measured distances from some origin or base).  It's a model that grew
out of machine addressing and assembler address modes many, many decades
ago -- long before C.  C, being a low-level language, obviously borrowed
it, but pretty much all the well-thought-out high-level languages have
seen the value in it too, though I'd be interested in hearing about
counterexamples.

The main issue -- of using a half open interval for a range -- is
probably less widely agreed upon, though I think it should be.  EWD is
correct about this (as about so many things).

-- 
Ben.
Ben
12/19/2016 1:48:01 PM
Ben Bacarisse writes:

> BartC writes:
>
>> You need to take your C hat off, I think.
>
> It's a computing hat.  Indexes are best seen as offsets (i.e. as a
> measured distances from some origin or base).  It's a model that grew
> out of machine addressing and assembler address modes many, many
> decades ago -- long before C.  C, being a low-level language,
> obviously borrowed it, but pretty much all the well-thought out
> high-level languages have seen the value in it too, though I'd be
> interested in hearing about counter examples.

Julia, at version 0.5 of the language, is a major counter-example:
1-based, closed ranges. I think they have been much influenced by the
mathematical practice in linear algebra, possibly through another
computing language.

I think there's some work going on to allow other starting points, or at
least 0. Not sure about half-open ranges.

> The main issue -- of using a half open interval for a range -- is
> probably less widely agreed upon, though I think it should be.  EWD is
> correct about this (as about so many things).

Agreed.

(I even use pen and paper, though I don't always remember what I wrote.)
Jussi
12/19/2016 2:06:40 PM
On 19/12/2016 13:48, Ben Bacarisse wrote:
> BartC <bc@freeuk.com> writes:

>> You need to take your C hat off, I think.
>
> It's a computing hat.  Indexes are best seen as offsets (i.e. as a
> measured distances from some origin or base).

A 1-based or N-based index can still be seen as an offset from element 
0, if that is what you want, although it might be an imaginary element.

But if you needed a table of the frequencies of letters A to Z, indexed 
by the ASCII codes of those letters, then with 0-based you will either 
waste a lot of entries (table[0] to table[64]), or need to build extra 
offsets into the code (table[letter-ord('A')]), or have to use a 
clumsier, less efficient table (ordered dict or whatever).

An N-based array can simply have bounds of ord('A') to ord('Z') 
inclusive. And the array would have just 26 elements.

> It's a model that grew
> out of machine addressing and assembler address modes many, many decades
> ago -- long before C.  C, being a low-level language, obviously borrowed
> it, but pretty much all the well-thought out high-level languages have
> seen the value in it too, though I'd be interested in hearing about
> counter examples.

Older languages tend to use 1-based indexing. Newer ones zero, but it's 
possible C has been an influence if that is what designers and 
implementers have been exposed to.

Among the 1-based or N-based languages are Fortran, Algol68 and Ada:

https://en.wikipedia.org/wiki/Comparison_of_programming_languages_(array)

(I'd only known 1- or N-based ones so the languages I've created have 
all been N-based with a default of 1. With exceptions for certain types, 
eg: strings are 1-based only; bit-indexing within machine words is 0-based.)

However, if the base for arrays is fixed at 0 or 1, then 0-based is 
probably more flexible, as you can emulate 1-based indexing by adding 
an extra element and ignoring element 0; it's a little harder the other 
way around!
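
As a rough sketch of that emulation (the chapter titles are made-up placeholders, and `titles` and `chapter_count` are illustrative names):

```python
# Chapter titles, with a dummy entry at index 0 so chapter n lives at titles[n].
titles = [None, "Introduction", "Lists", "Slicing"]

print(titles[1])           # Introduction
print(titles[3])           # Slicing

# The cost: the dummy must be skipped when iterating or counting.
chapter_count = len(titles) - 1
for number, title in enumerate(titles[1:], start=1):
    print(number, title)
```

The trade-off is that len() and plain iteration now see the padding element, so every use has to remember to skip it.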

> The main issue -- of using a half open interval for a range -- is
> probably less widely agreed upon, though I think it should be.  EWD is
> correct about this (as about so many things).

With 0-based lists, then half-open intervals are more suited. Otherwise 
a list of N elements would be indexed by a range of [0:N-1]. While 
1-based is more suited to closed intervals: [1:N].

But then, suppose you also had an alternate syntax for a range, of 
[A^^N], which starts at A and has a length of N elements. Then ranges of 
[1^^N] and [0^^N] are both specified the same way. You don't need to 
worry about open or closed intervals.
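
Python has no such syntax, but the idea can be approximated with a small helper. This is a sketch only: `span` is a hypothetical function, not part of Python, and it assumes a non-negative start:

```python
def span(start, length):
    """Hypothetical helper: a slice of `length` items beginning at `start`."""
    return slice(start, start + length)

items = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
print(items[span(2, 3)])   # [20, 30, 40]
print(items[span(0, 4)])   # [0, 10, 20, 30]
```

Because the stop is computed as start + length, the helper relies on the half-open convention underneath; a negative start would need extra handling.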

(I've seen two arrows ^^ used on road signs to denote the length of an 
impending tunnel rather than the distance to it.)

-- 
Bartc


BartC
12/19/2016 3:48:05 PM
Jussi Piitulainen <jussi.piitulainen@helsinki.fi> writes:

> Ben Bacarisse writes:
>
>> BartC writes:
>>
>>> You need to take your C hat off, I think.
>>
>> It's a computing hat.  Indexes are best seen as offsets (i.e. as a
>> measured distances from some origin or base).  It's a model that grew
>> out of machine addressing and assembler address modes many, many
>> decades ago -- long before C.  C, being a low-level language,
>> obviously borrowed it, but pretty much all the well-thought out
>> high-level languages have seen the value in it too, though I'd be
>> interested in hearing about counter examples.
>
> Julia, at version 0.5 of the language, is a major counter-example:
> 1-based, closed ranges. I think they have been much influenced by the
> mathematical practice in linear algebra, possibly through another
> computing language.

Interesting.  Thanks.

<snip>
-- 
Ben.
Ben
12/19/2016 8:44:39 PM
BartC wrote:
> But if you needed a table of the frequencies of letters A to Z...
> An N-based array can simply have bounds of ord('A') to ord('Z') 
> inclusive.

That's fine if your language lets you have arrays with
arbitrary lower bounds.

But if the language only allows a fixed lower bound, and
furthermore insists that the lower bound be 1, then your
indexing expression becomes:

    table[ord(letter) - ord('A') + 1]

If a language is only going to allow me a single lower
bound, I'd rather it be 0, because I can easily shift
that by whatever offset I need. But if it's 1, often
I need to cancel out an unwanted 1 first, leading to code
that's harder to reason about.
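
For comparison, a sketch of the 0-based version of the letter-frequency table in Python (`letter_frequencies` is an illustrative name; only the ASCII letters A to Z are counted):

```python
def letter_frequencies(text):
    """Count occurrences of 'A'..'Z' in a 26-element, 0-based table."""
    table = [0] * 26
    for ch in text.upper():
        if "A" <= ch <= "Z":
            table[ord(ch) - ord("A")] += 1   # offset from 'A', no extra + 1
    return table

table = letter_frequencies("Hello, world")
print(table[ord("L") - ord("A")])   # 3
print(table[ord("O") - ord("A")])   # 2
```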

-- 
Greg
Gregory
12/19/2016 9:43:35 PM
On Mon, 19 Dec 2016 03:21 am, BartC wrote:

> On 18/12/2016 10:59, Paul Götze wrote:
>> Hi John,
>>
>> there is a nice short article by E. W. Dijkstra about why it makes sense
>> to start numbering at zero (and exclude the upper given bound) while
>> slicing a list. Might give a bit of additional understanding.
>>
>> http://www.cs.utexas.edu/users/EWD/ewd08xx/EWD831.PDF
> 
> (This from somebody who apparently can't use a typewriter?!)

Oh Bart don't be so hard on yourself, I don't think it matters whether or
not you can use a typewriter, your comments would be equally interesting
whether you used a typewriter, spoke them aloud and had them transcribed
by voice recognition software, or wrote them out by hand.

(Handwriting is a lost skill in this day and age of people poking at their
JesusPhones with their thumb, typing out txt sp3k 1 ltr at a time.)


> I don't know if the arguments there are that convincing. Both lower
> bounds of 0 and 1 are useful; 

Dijkstra doesn't deny that other bounds can be useful.

The unstated assumption in his argument is you can only pick one system for
indexing into arrays. But of course that's not strictly correct --
different languages can and do choose different schemes, or even choose
different schemes for each language.


> some languages will use 0, some 1, and 
> some can have any lower bound.
> 
> But a strong argument for using 1 is that in real life things are
> usually counted from 1 (and measured from 0).

I'm sure that when European merchants first learned of this
new-fangled "zero" from their Arabic and Indian colleagues in the Middle
Ages, they probably had a quite similar reaction.

After all, didn't the Greeks themselves claim that the smallest number was
two? Zero is clearly not a number, it is the absence of number, and 1 is
not a number, it is unity.

http://math.furman.edu/~mwoodard/fuejum/content/2012/paper1_2012.pdf

You do make a point that even children (at least those about the age of
three or four) can understand 1-based indexing, but I don't think it is a
*critical* point. Most programmers have learned a bit more than the average
four year old.

If you can understand threads, OOP design patterns, binary search or C
pointers, counting indexes 0, 1, 2, 3, ... should hold no fears for you.


> So if you wanted a simple list giving the titles of the chapters in a
> book or on a DVD, on the colour of the front doors for each house in a
> street, usually you wouldn't be able to use element 0.
> 
> As for slice notation, I tend to informally use (not for any particulr
> language) A..B for an inclusive range, and A:N for a range of length N
> starting from A.

An interesting thought... I like the idea of being able to specify slices by
index or by length.


 
> In Python you can also have a third operand for a range, A:B:C, which
> can mean that B is not necessarily one past the last in the range, and
> that the A <= i < B condition in that paper is no longer quite true.

You cannot use a simple comparison like A <= i < B to represent a range with
a step-size not equal to 1. How would you specify the sequence:

    2, 4, 6, 8, 10, 12, 14

as a single expression with ONLY an upper and lower bound?

    2 <= i <= 14

clearly doesn't work, and nor does any alternative.
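
A quick sketch of the difference, using range() with an explicit step (`evens` and `by_bounds` are illustrative names):

```python
evens = list(range(2, 15, 2))
print(evens)          # [2, 4, 6, 8, 10, 12, 14]

# Bounds alone admit every integer in between, not just every second one.
by_bounds = [i for i in range(2, 15) if 2 <= i <= 14]
print(len(evens), len(by_bounds))   # 7 13
```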

Dijkstra doesn't discuss range() with non-unit step sizes.


> In fact, A:B:-1 corresponds to A >= i > B, which I think is the same as
> condition (b) in the paper (but backwards), rather (a) which is favoured.

I have no way of knowing what it corresponds to in your head, but in Python,
a slice A:B:-1 corresponds to the elements at indices:

    A, A-1, A-2, A-3, ..., B+1

*in that order*.

py> '0123456789'[7:2:-1]
'76543'

Since Dijkstra doesn't discuss non-unit step sizes, it is unclear what he
would say about counting backwards.


> Another little anomaly in Python is that when negative indices are used,
> it suddenly switches to 1-based indexing! Or least, when -index is
> considered:
> 
>    x = [-4,-3,-2,-1]
> 
>    print x[-1]       # -1  Notice the correspondence here...
>    print x[-2]       # -2

Congratulations, you've discovered that 4-1 equals 3. Well done!

>    x = [1, 2, 3, 4]
> 
>    print x[1]        # 2   ...and the lack of it here
>    print x[2]        # 3

That's silly. You can get whichever correspondence you like by choosing
values which do or don't match the indices used by Python:

    x = [-3, -2, -1, 0]
    print x[-1]  # Notice the lack of correspondence

    x = [0, 1, 2, 3]
    print x[1]  # Notice the correspondence here

    x = [6, 7, 8, 9]
    print x[1]  # Notice the lack of correspondence here

What's your point?


I infer that you're having difficulty with Python slicing and indexing
because you have a mental model where indexes label the items themselves
(best viewed in a fixed width font):

+---+---+---+---+---+---+
| P | y | t | h | o | n |
+---+---+---+---+---+---+
...0...1...2...3...4...5        # Indexes
..-6..-5..-4..-3..-2..-1        # 6-1 = 5, 6-2 = 4 etc.


That makes 'Python'[1] easy to visualise (it's obviously just 'y') but
slicing becomes more complex:

    'Python'[1:4]
    => *include* index 1, *exclude* index 4
    => 'yth'

But a better model is to consider the *boundaries* labelled:

+---+---+---+---+---+---+
| P | y | t | h | o | n |
+---+---+---+---+---+---+
0...1...2...3...4...5...6        # Indexes
-6..-5..-4..-3..-2..-1


A simple index is a bit more complex (think of it as the element immediately
following the boundary, hence 'Python'[1] is still 'y') but slicing now
becomes simple. Visualise slicing along the boundaries (that's where the
name comes from!) and now it is obvious that the start index is included
and the stop index is excluded.

+---+---+---+---+---+---+
| P | y | t | h | o | n |
+---+---+---+---+---+---+
.....^...........^
cut here    and here


There's no mental model that makes slicing with non-unit step sizes easy,
*especially* if the step size is a non-unit negative number (say, -3). For
those you really have to simulate the process of actually iterating over
the indexes one by one.
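
One way to do that simulation is sketched below. The helper name simulate_slice is made up for this example; it leans on the built-in slice.indices() to resolve missing or negative values, then walks the indexes one at a time:

```python
def simulate_slice(seq, start, stop, step):
    """Hypothetical helper: walk the indexes one by one, as a slice would."""
    # slice.indices() resolves None and negative values against the length.
    i, stop, step = slice(start, stop, step).indices(len(seq))
    picked = []
    while (step > 0 and i < stop) or (step < 0 and i > stop):
        picked.append(i)
        i += step
    return picked

s = '0123456789'
print(simulate_slice(s, 8, 1, -3))   # [8, 5, 2]
print(s[8:1:-3])                      # 852
```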



-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.

0
Steve
12/20/2016 12:49:19 AM
On 20/12/2016 00:49, Steve D'Aprano wrote:
> On Mon, 19 Dec 2016 03:21 am, BartC wrote:
>
>> On 18/12/2016 10:59, Paul Götze wrote:
>>> Hi John,
>>>
>>> there is a nice short article by E. W. Dijkstra about why it makes sense
>>> to start numbering at zero (and exclude the upper given bound) while
>>> slicing a list. Might give a bit of additional understanding.
>>>
>>> http://www.cs.utexas.edu/users/EWD/ewd08xx/EWD831.PDF

> If you can understand threads, OOP design patterns, binary search or C
> pointers, counting indexes 0, 1, 2, 3, ... should hold no fears for you.

Who's talking about me? In the area of scripting languages that could be 
used by people with a wide range of expertise, I would have expected 
1-based to be more common.

>> In Python you can also have a third operand for a range, A:B:C, which
>> can mean that B is not necessarily one past the last in the range, and
>> that the A <= i < B condition in that paper is no longer quite true.
>
> You cannot use a simple comparison like A <= i < B to represent a range with
> a step-size not equal to 1. How would you specify the sequence:
>
>     2, 4, 6, 8, 10, 12, 14
>
> as a single expression with ONLY an upper and lower bound?

>     2 <= i <= 14
>
> clearly doesn't work, and nor does any alternative.

I didn't mean that the expression specifies the range, but that a value 
in the range satisfies the expression.

>> In fact, A:B:-1 corresponds to A >= i > B, which I think is the same as
>> condition (b) in the paper (but backwards), rather (a) which is favoured.
>
> I have no way of knowing what it corresponds to in your head, but in Python,
> a slice A:B:-1 corresponds to the elements at indices:

I got the idea for a minute that, since A:B:C specifies a set of 
indices, and that range(A,B,C) specifies the same set of values, that 
they are interchangeable. But of course you can't use x[range(A,B,C)] 
and you can't do 'for i in A:B:C'. [In my, ahem, own language, you can 
do just that...]
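
For what it's worth, the two can be bridged by hand. The sketch below assumes in-range, non-negative A and B for the first comparison; the list and names are only for illustration:

```python
x = list('abcdefghij')
A, B, C = 7, 2, -1

# For in-range, non-negative A and B, range() and the slice visit the same indexes.
via_range = [x[i] for i in range(A, B, C)]
print(via_range == x[A:B:C])    # True

# A slice object can be turned into the matching range for a given length.
r = range(*slice(A, B, C).indices(len(x)))
print(list(r))                   # [7, 6, 5, 4, 3]

# They stop agreeing once a negative bound is involved:
print(x[5:-1])                   # ['f', 'g', 'h', 'i']
print(list(range(5, -1)))        # []
```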

>     A, A-1, A-2, A-3, ..., B+1
>
> *in that order*.
>
> py> '0123456789'[7:2:-1]
> '76543'

When A is 7, B is 2 and C is -1, then

A, B = 7, 2
for i in range(-1000, 1000):
    if A >= i > B:
        print(i)

displays only the numbers 3,4,5,6,7; the same elements you get, and they 
satisfy A >= i > B as I said.

>> Another little anomaly in Python is that when negative indices are used,
>> it suddenly switches to 1-based indexing! Or least, when -index is
>> considered:

> That's silly. You can get whichever correspondence you like by choosing
> values which do or don't match the indices used by Python:
>
>     x = [-3, -2, -1, 0]
>     print x[-1]  # Notice the lack of correspondence
>
>     x = [0, 1, 2, 3]
>     print x[1]  # Notice the correspondence here
>
>     x = [6, 7, 8, 9]
>     print x[1]  # Notice the lack of correspondence here

> What's your point?

The point is that indices go 0,1,2,3,... when indexed from the start, 
and -1,-2,-3,... when indexed from the end.

That means that the THIRD item from the start has index [2], but the 
THIRD item from the end has index [-3].

(So if you reversed a list x, and wanted the same x[i] element you had 
before it was reversed, it now has to be x[-(i+1)]. 1-based, the 
elements would be x[i] and x[-i], a little more symmetric. But with 
N-based, you wouldn't be able to index from the end at all.)
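
The reversal relationship can be checked in a few lines. The list here is arbitrary; the ~i spelling works because ~i equals -(i+1) for Python integers:

```python
x = [10, 20, 30, 40, 50]
rev = x[::-1]

for i in range(len(x)):
    assert x[i] == rev[-(i + 1)]
    assert x[i] == rev[~i]     # ~i is -(i + 1) for Python integers

print(x[1], rev[-2])   # 20 20
```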

> +---+---+---+---+---+---+
> | P | y | t | h | o | n |
> +---+---+---+---+---+---+
> ....^...........^
> cut here    and here

That's the fence/fencepost thing I mentioned elsewhere. It comes up also 
in graphics: if these boxes represent pixels rather than elements, and a 
function draws a line from pixel 1 to pixel 4 or fills them in, then 
should pixel 4 be filled in or not?

But if you think of these values as continuous measures from the left 
edge, rather than as discrete units, and denote them as 1.0 to 4.0, then 
it becomes more obvious. The 3 pixels between 1.0 and 4.0 are filled in.
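
In Python terms that is what a slice assignment does, if the two numbers are read as edge positions. A rough sketch, using a row of characters as stand-in pixels:

```python
pixels = ['.'] * 6
left, right = 1, 4            # edge positions, measured from the left edge

# Fill everything between the two edges: right - left pixels in total.
pixels[left:right] = ['#'] * (right - left)
print(''.join(pixels))        # .###..
```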

-- 
bartc
0
BartC
12/20/2016 11:45:07 AM