C++ Tutorial: The Do's and Don'ts of Accessing One Element Past the End of an Array

Friday Mar 11th 2011 by Danny Kalev

A buffer overflow is the result of writing to an element that is outside the boundaries of an array. And yet, certain algorithms need to access the address of one element past the end of an array, albeit with a few important restrictions. Here's the why and how.

Introduction

Seemingly, the only safe option is to never touch anything outside the valid boundaries of an array. However, there are cases when you need the memory address of one element past the end of an array: when traversing an array, or in algorithms that manipulate a sequence of elements. Contrary to a common belief, C++ does permit you to form and use the address of one element past the end of an array. However, you have to do that very carefully, paying attention to several important restrictions.

Sorting an Array

Accessing the address of one element past the end of an array is required more often than you think. Suppose you have an array of integers that you want to sort using the std::sort() algorithm:

  int arr[5]= {3,4,89,7,0};
  std::sort(arr, arr+5);

std::sort() requires two random-access iterators (recall that pointers are perfectly valid iterators): one indicating the first element of a sequence, and another indicating the end of the sequence. Notice that the end of a sequence does not point to the last valid element of the array, i.e., arr[4]. Rather, it's the address of one element past the last valid element, namely arr+5. Any attempt to dereference arr+5 would result in undefined behavior. However, the address arr+5 itself is valid for certain purposes.
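
A complete version of the call above might look like the following minimal sketch. The helper name sorted_five is hypothetical and exists only for illustration. The second call uses std::begin() and std::end(), which assumes a C++11 compiler; they compute the same two pointers for you.

```cpp
#include <algorithm>  // std::sort, std::copy
#include <array>      // std::array
#include <iterator>   // std::begin, std::end

// Hypothetical helper: sorts the five values from the example and returns them.
std::array<int, 5> sorted_five()
{
    int arr[5] = {3, 4, 89, 7, 0};

    // arr + 5 is one past the last element. std::sort only compares
    // against it and never dereferences it.
    std::sort(arr, arr + 5);

    // Equivalent call: std::end(arr) yields the same address as arr + 5.
    std::sort(std::begin(arr), std::end(arr));

    std::array<int, 5> result{};
    std::copy(std::begin(arr), std::end(arr), result.begin());
    return result;
}
```

Letting std::end() compute the end pointer avoids repeating the array size by hand, which is one less place for an off-by-one mistake.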

Valid and Invalid Operations

Taking the address of one element past the end of an array is safe and permitted, so long as you don't use that address to read or write the data to which it points. You're also not allowed to increment that pointer any further. However, you may decrement it, and you can use it in pointer comparison expressions, as in the following example:

   for (int *p=arr; p<arr+5; p++)
     *p=0; // clear the array

Or even like this:

   int n=5;
   while(n) // a more cumbersome method of clearing the array
   {
    *(arr+5-n)=0;
    n--;
   }

In contrast, dereferencing the expression arr+5 is undefined:

   if (*(arr+5)) // undefined behavior
    x++;

Generally speaking, the Standard allows you to use arr+5 only as a pointer value, never to access the value to which it's pointing.
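
Because decrementing the one-past-the-end pointer is allowed, that pointer can serve as the starting point for walking an array backwards. The following is a minimal sketch of the idea; the function name reversed_copy and its parameters are hypothetical and introduced only for illustration.

```cpp
#include <cstddef>  // std::size_t
#include <vector>   // std::vector

// Hypothetical helper: returns the elements of [first, first + count)
// in reverse order.
std::vector<int> reversed_copy(const int* first, std::size_t count)
{
    std::vector<int> out;
    const int* p = first + count;  // one past the end: fine to form, not to dereference

    while (p != first) {           // comparing against the start is allowed
        --p;                       // decrement first, so p now points at a real element
        out.push_back(*p);         // safe: p lies within [first, first + count)
    }
    return out;
}
```

Note the order inside the loop: the pointer is decremented before it is dereferenced, so the one-past-the-end address itself is never read.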

The same rule applies to STL iterators, as the following vector example shows:

  std::vector<int> vi;
  vi.push_back(1);
  std::vector<int>::iterator it = vi.end();
  *(--it) = 8; // OK, assigns 8 to vi[0]
  ++it; // advance to one past the last valid element
  if (it == vi.end()) // OK, comparison
    std::cout << "You have reached the end of the vector" << std::endl;
  std::cout << *it << std::endl; // undefined behavior: dereferencing end()

Summary

Accessing the address of one past the last element of an array is a valid operation under certain conditions. You can use that address only in comparisons and in pointer arithmetic expressions that yield valid elements of the array. You're not allowed to dereference it, nor can you increment the pointer any further (say, to reach the second or third element past the array's end). Notice that STL containers follow this idiom. The end() member function returns an iterator pointing to one element past the last element of the container. You may use the iterator returned from end() only in comparisons and in expressions that access valid elements of the container.
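
One common comparison against end() is checking whether a search succeeded, since std::find() returns the end iterator when no element matches. A minimal sketch follows; the helper name contains is hypothetical and not part of any library.

```cpp
#include <algorithm>  // std::find
#include <vector>     // std::vector

// Hypothetical helper: reports whether value occurs in v.
bool contains(const std::vector<int>& v, int value)
{
    // std::find returns v.end() when nothing matches. The result is only
    // compared against end(), never dereferenced.
    std::vector<int>::const_iterator it = std::find(v.begin(), v.end(), value);
    return it != v.end();
}
```

Code that wants the matching element itself should dereference the returned iterator only after this comparison has ruled out end().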
