
Dabbling with structures and unions

Mayukh Datta
Technical C

There is a lot more to structures and unions than the basics. I’ll go through what I’ve learned about them here, assuming that you’re not a beginner in the C language.

To store or manipulate data in the computer’s memory, we write a program that tells the compiler (or interpreter) to set aside space for that data. We give that location a name to make the data easy to retrieve and use; that is what we call a variable. A variable is a named location in memory that holds some data, which can be an integer, a floating-point number, a character, or a boolean. A data type tells the compiler what kind of data the programmer intends to store and how much space it needs.

There are two kinds of data types in C:

  • Primitive data types: The basic building blocks provided by the language, such as char, int, float, and double (plus bool since C99, via <stdbool.h>).
  • User-defined data types: They are the ones waiting for you to create them, such as structures, unions, and enums. Arrays and pointers are usually grouped with them, although strictly speaking they are derived types built from other types.

‘Unity in diversity’ data types

We use a variable to store a single value, and an array to store multiple values of the same data type. But what if we want something that can store items of different data types, a single type that unites several diverse ones? C provides structures and unions, which let us group items of different data types into a single user-defined type.

Suppose you have to write a program that takes the details of several books as input and prints them as output. The details of a book are its book ID, publication date, title, author’s name, price, and type, i.e. paperback or hardcover. You also have to calculate the delivery charge for shipping each book and print its final price. If the book is a paperback priced below Rs. 1000, charge Rs. 40 for delivery; if it’s a hardcover priced below Rs. 1500, charge Rs. 50 for delivery.

C code:

#include<stdio.h>
#include<stdlib.h>

int main(void){

 /** enum - user defined data type used to assign names 
 * to integral constants for the ease of readability
 * and maintainability. If not assigned any value
 * explicitly, the default values are 0, 1, 2,...
 * 
 * typedef - used to give a new name to any data type.
 **/

 typedef enum {Paperback, Hardcover} BookType;

 /** defining a Book ADT (abstract data type)
 * using structures. Book can now store items
 * of different data types. **/

 typedef struct{
  char Title[50], PublishedDate[11], AuthorName[30];
  int BookID;
  double Price;
  BookType Type;
 }Book;

 /** To store details of several books, we need an array.
 * So, let's declare an array of structures (Book) **/

 Book BookData[100]; //It can store details of 100 books at most.
 char choice;
 int i=0, j;

 do{
  printf("\nEnter Book's Title: ");
  fgets(BookData[i].Title, sizeof(BookData[i].Title), stdin);
  printf("Enter Author's Name: ");
  fgets(BookData[i].AuthorName, sizeof(BookData[i].AuthorName), stdin);
  printf("Enter the date of publication (DD/MM/YYYY): ");
  fgets(BookData[i].PublishedDate, sizeof(BookData[i].PublishedDate), stdin);
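  /* No fflush(stdin) here: flushing an input stream is undefined behaviour
   * in standard C. Any leftover newline is skipped by scanf("%d") below. */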
  printf("Enter Book ID: ");
  scanf("%d", &BookData[i].BookID);
  printf("Enter Book price: ");
  scanf("%lf", &BookData[i].Price);
  printf("Enter Book Type (0 for Paperback/1 for Hardcover): ");
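  {
   int type = Paperback, c; /* scanf("%u") into an enum is not portable, so read an int first */
   scanf("%d", &type);
   BookData[i].Type = (type==Hardcover) ? Hardcover : Paperback;
   while ((c = getchar()) != '\n' && c != EOF); /* discard the rest of the line; stop at EOF too */
  }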

  if(BookData[i].Type==Paperback && BookData[i].Price<1000){
   BookData[i].Price += 40;
  }else if(BookData[i].Type==Hardcover && BookData[i].Price<1500){
   BookData[i].Price += 50;
  }
  
  printf("Want to enter details of another book? (Y/N): ");
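  if(scanf(" %c", &choice) != 1)
   choice = 'N'; /* treat end of input as "no" */
  {
   int c;
   while ((c = getchar()) != '\n' && c != EOF); /* discard the rest of the line */
  }
  i++;
 }while((choice=='y' || choice=='Y') && i<100); /* never write past BookData[99] */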
 
 for(j=0;j<i;j++){
  printf("\nBook #%d\n", j+1);
  printf("-----------------\n");
  printf("Book ID: %d\n", BookData[j].BookID);
  printf("Title of Book: %s", BookData[j].Title);
  printf("Author's Name: %s", BookData[j].AuthorName);
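  /* A full 10-character date leaves fgets no room for '\n',
   * so print up to any newline and end the line ourselves. */
  printf("Date of Publication: ");
  for(int k=0; BookData[j].PublishedDate[k]!='\0' && BookData[j].PublishedDate[k]!='\n'; k++)
   putchar(BookData[j].PublishedDate[k]);
  putchar('\n');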
  if(BookData[j].Type == Paperback)
   printf("Book Type: Paperback\n");
  else if(BookData[j].Type == Hardcover)
   printf("Book Type: Hardcover\n");
  printf("Price: Rs. %.2f\n", BookData[j].Price); //includes the delivery charge, if any
  printf("-----------------\n");
 }
}

Inside the memory

A structure is stored in one contiguous block of memory, so each field sits at a fixed offset from the start of the structure. The compiler gives every member its own storage within that block, so changing the value of one member does not affect the others.

In the case of a 32-bit compiler (4-byte int, 8-byte double), how many bytes does each member take?
Title = 50 bytes
PublishedDate = 11 bytes
AuthorName = 30 bytes
BookID = 4 bytes
Price = 8 bytes
Type = 4 bytes
Total Bytes = 107 bytes
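But sizeof(Book) reports 112 bytes (with a compiler that aligns double on an 8-byte boundary).

We assumed 107 bytes, but the actual size is 112 bytes. The size of a structure is always greater than or equal to the sum of the sizes of its members. It is greater here because the compiler inserts unused bytes between members, and sometimes after the last one, so that each member starts at an address suited to its type. The three char arrays take 91 bytes, so 1 byte is added to put BookID at offset 92; Price then starts at 96 and Type at 104, and 4 more bytes at the end round the size up to 112, a multiple of 8. This is called structure padding. The processor can read and write aligned data in fewer memory accesses, which is why compilers do it.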

Just as we can have pointers to an int or a double, we can have pointers to a structure.

    #include<stdio.h>
    int main(void){
        typedef struct{
            int length, width;
        }Line;
        Line L, *Lp;
        L.length = 12;
        Lp = &L; //pointer Lp points to L
        Lp->width = 2;
        printf("Length: %d, Width: %d\n", L.length, Lp->width);
        /** Through a pointer we access members with the arrow
        * operator (->) instead of the dot operator (.).
        * Lp->width is shorthand for (*Lp).width. **/
    }
    /** Prints
    * Length: 12, Width: 2
    **/

Here we declared the pointer Lp after the structure definition was complete. We can also declare a pointer to a structure before its definition is finished, because C allows pointers to incomplete types. A structure that holds a pointer to its own type is called a self-referential structure, and it is an example of a recursive data type.

    //A Linked List Node
    typedef struct Node{
        int data;
        struct Node *next;
    }Node;

This structure contains a pointer member that can point to another object of the same type.

C won’t let a structure contain a member of its own type, though: the type is still incomplete at that point, and such a structure would need infinite space.

    typedef struct Book{
        int price;
        struct Book weight; //error: struct Book is still an incomplete type here
    }Book;

We can also nest one structure within another structure.

    #include<stdio.h>
    #include<string.h>
    int main(void){
     typedef struct{
      char Name[30];
      struct Address{
       char City[20];
       int Pin;
      }A;
     }Person;
     Person P;
     strcpy(P.Name, "Xyz Ijk");
     strcpy(P.A.City, "Jamaica");
     P.A.Pin = 569235;
     printf("Name: %s, City: %s, Pin: %d\n", P.Name, P.A.City, P.A.Pin);
    }
    /** Prints
    * Name: Xyz Ijk, City: Jamaica, Pin: 569235
    **/

Like structures, unions can group items of different data types, but there are significant differences between them. Unlike a structure, a union gives all of its members the same memory location, and they all share it. Therefore, writing to any one member overwrites the bytes that the others see, and the most recently written member is the only one whose value you can rely on. The size of a union is equal to or greater than the size of its largest member, since it may be padded for alignment.

    #include<stdio.h>
    #include<string.h>
    int main(void){
     typedef union{
      int A, B;
      char C[10];
     }UnionD;
     printf("Size of the union: %zu\n", sizeof(UnionD)); //%zu is the format for size_t
     UnionD P; //union variable declared
     P.A = 20; //Value of B and C affected
     printf("A = %d, B = %d, C = %s\n", P.A, P.B, P.C);
     P.B = 10; //Value of A and C affected
     printf("A = %d, B = %d, C = %s\n", P.A, P.B, P.C);
     strcpy(P.C, "Godfather"); //Value of A and B affected
     printf("A = %d, B = %d, C = %s\n", P.A, P.B, P.C);
    }
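    /** Prints (on a little-endian machine with 4-byte int; in the
    * first two value lines C holds a single unprintable character)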
    * Size of the union: 12
    * A = 20, B = 20, C =  
    * A = 10, B = 10, C = 
    * A = 1717858119, B = 1717858119, C = Godfather
    **/

Unions can be nested too. We can also put unions in structures and vice versa, and declare pointers to unions. We use a union when only one of its members is needed at any given time. In my opinion, though, it’s best to avoid unions unless you are working in a memory-constrained environment: because the members share the same memory, it is easy to lose track of which one currently holds a valid value.

So, we can design complex abstract data types and data structures using both structures and unions.
