Exercise 4: Dynamic Arrays and Strings

Prerequisite

You have completed Exercise 3.

Learning Outcomes

Be comfortable writing simple C programs that involve dynamically allocated arrays and strings.

Deadline

This is part of the CS1010 formative assessment. Submit before 16 October 2023, 23:59 to receive feedback and earn your achievement "badges".

Acceptance Link

The link to accept the exercise is not available publicly. Visit Canvas for the link.

Concepts and Difficulty

	Question	String	Character	Dynamic-Size Array	Difficulty
1	Word
2	Line
3	Up^
4	List
5	Length^
6	Concat^
7	Kendall
8	Search
9	Subtract

^: Questions 3, 5, and 6 serve as in-class exercises during the lab session in Week 8.

CS1010 I/O Library

You may need to use a new collection of functions available in the CS1010 I/O library for this exercise onwards.

Reading

cs1010_read_size_t
cs1010_read_word
cs1010_read_line
cs1010_read_long_array
cs1010_read_double_array

Note that, for this exercise, you should avoid cs1010_read_word_array and cs1010_read_line_array unless you know what you are doing, as it involves 2D arrays.

Printing

cs1010_print_size_t and cs1010_println_size_t

Separate Compilation

In some questions, you will see that the main function is located in a separate C file, with a -main.c suffix, e.g., length-main.c. The function that you are tasked to write is located in the usual <problem>.c file (e.g., length.c).

In these cases, you are not allowed to change the C file that contains the main function.

Hidden Test Cases

From this exercise onwards, we may test your code more extensively using additional test cases not included in the skeleton. You should test your code extensively using your own test cases.

Question 1: Word

Using cs1010_read_word, write a program called word that reads a word from the standard input and prints the word back to the standard output.

The input is guaranteed to have at least one word.

Sample Runs

ooiwt@pe113:~/ex04-ooiwt$ ./word
hello
hello
ooiwt@pe113:~/ex04-ooiwt$ ./word
Bye bye, see you later!
Bye

Question 2: Line

Using cs1010_read_line, write a program called line that reads a line of text from the standard input and prints the line back to the standard output.

Pay attention to the printing of new lines. The input is guaranteed to have at least one line (but the line may be empty).

Sample Runs

ooiwt@pe113:~/ex04-ooiwt$ ./line
hello
hello
ooiwt@pe113:~/ex04-ooiwt$ ./line
Bye bye, see you later!
Bye bye, see you later!

Question 3: Up

Using cs1010_read_line, write a program called line that reads a line of text from the standard input and prints the line back to the standard output in upper case.

You are not allowed to use C standard library functions islower or toupper to solve this. Instead of islower, compare if a character is between 'a' and 'z' to determine if it is a lowercase character. Instead of toupper, add the offset 'A' - 'a' to a lower case character.

The input is guaranteed to have at least one line and the line may be empty.

Sample Runs

ooiwt@pe113:~/ex04-ooiwt$ ./up
hello
HELLO
ooiwt@pe113:~/ex04-ooiwt$ ./up
Bye bye, see you later!
BYE BYE, SEE YOU LATER!

Question 4: List

Write a program list that reads a list of integers from the standard input, stores it in a dynamically allocated array, and then prints out the list in reverse order, one number per line.

The first integer read from the standard input is \(k\) (\(k > 0\)), the number of elements of the list. The next \(k\) integers are the elements of the list.

Sample Run

ooiwt@pe101:~/ex04-ooiwt$ ./list
3
100 -120 130
130
-120
100

Question 5: Length

Write a program length that reads a line of text from the standard input and prints the number of characters in the line (including the newline character) to the standard output.

You may use cs1010_read_line to read the line of text from the standard input.

Fill in the following method in the given skeleton code:

size_t length_of(const char *str) {
    // fill in the function
}

The goal of this exercise is to let you practice processing and manipulating strings in C. Solve this problem directly without using any C library functions from string.h.

Sample Run

ooiwt@pe101:~/ex04-ooiwt$ ./length
Hello world!
13

Question 6: Concat

Write a program concat that reads two words from the standard input, append them into a new string, and then print a new string, to the standard output.

You may use cs1010_read_word to read a word from the standard input.

Fill in the following method in the given skeleton code:

char* concatenate(const char *str1, const char *str2) {
    // fill in the function
}

The function is responsible for allocating the required amount of memory for the resulting string. The caller of concatenate is responsible for deallocation.

The goal of this exercise is to let you practice processing and manipulating strings in C. Solve this problem directly without using any C library functions from string.h. You may use the method length_of you have written above.

Sample Run

ooiwt@pe101:~/ex04-ooiwt$ ./concat
cs
1010
cs1010

Question 7: Kendall

Suppose that we are given a set of items and we ask two different parties to rank the items according to some order. We may get two different orders of the items. How do we measure how similar (or dissimilar) the two rankings are?

For example, consider a search engine that returns a list of web pages ranked by their relevance to the search query. A user may not always agree with the ranking of the search engine and may judge the relevance of the search result differently, i.e., the user may have his or her own ranking. This measurement of similarity between the ranking by the search engine and the ranking by the user gives us a metric on how good the search engine result is. The more similar it is to the ranking of the user, the better the search engine is in ranking in the search results.

One way to measure the similarity of the two rankings is the Kendall tau distance. You will write a program kendall that calculates the normalized Kendall tau distance for this question.

We will represent a ranking by the order of the items. The first item is ranked 1, the second is ranked 2, and so on. To simplify the problem, we take one of the rankings that we want to calculate the Kendall tau distance on and label the items sequentially, as the sequence 1, 2, 3, 4, 5, ... \(n\), where n is the number of items. We call this the base ranking. The other ranking will then be a permutation of the numbers 1 to \(n\).

For example, suppose we have three items A, B, and C. The first ranking ranks the items as B, C, and A. The second ranking ranks the items C, A, B. After relabelling the first ranking as 1, 2, 3, the second ranking becomes 2, 3, 1.

The Kendall tau distance counts the pairs of items in one ranking that are ranked in a different order in the other ranking. In the example above, we have three possible pairs:

Pair	Ranking 1	Ranking 2
A-B	B then A	A then B
A-C	C then A	C then A
B-C	B then C	C then B

Out of the three pairs, the pairs A-B and B-C are ordered differently in the two rankings, so that Kendall tau distance is 2.

The normalized Kendall tau distance is the ratio of the number of pairs ranked in a different order to the number of all possible pairs.

In the example above, the normalized Kendall distance is 2/3 = 0.6666.

Your program should read the following from the standard input:

The first positive integer, \(n\), is the number of items (\(n \ge 2\)).
The next \(n\) numbers are a permutation of integers between 1 to \(n\). This corresponds to the ranking of the items from 1 to \(n\).

Your program should print the normalized Kendall tau distance between the ranking read above and the base ranking (1, 2, 3, .. \(n\)) to the standard output.

Your program must not assume any limit to the length of the input, except that their length can fit into the type size_t.

As a guideline for efficiency, your code should take no more than 5s for the given test cases.

Sample Runs

ooiwt@pe118:~/ex04-ooiwt$ ./kendall
3
2 3 1
0.6667
ooiwt@pe118:~/ex04-ooiwt$ ./kendall
10
1 2 3 4 5 6 7 8 9 10
0.0000
ooiwt@pe118:~/ex04-ooiwt$ ./kendall
6
6 5 4 3 2 1
1.0000

Question 8: Search

Write a program search that finds a list of words \(w_0, w_1, .. w_{k-1}\) appear in a given string \(s\).

The program reads the following from the standard input:

The first line of the input is the string \(s\)
The next line is an integer \(k\)
The next \(k\) lines contain one word each.

The program then prints \(k\) lines to the standard output:

If \(w_i\) is a substring of \(s\), then print the position of the first occurrence of \(w_i\) within \(s\). The first character in \(s\) has position 0, the second character has position 1, etc.
If \(w_i\) is not a substring of \(s\), then print the string "not found".

You may use cs1010_read_word to read the list of words \(w_0..w_{k-1}\) one-by-one from the standard input. Avoid cs1010_read_word_array for this exercise, unless you know what you are doing.

Sample Runs

ooiwt@pe101:~/ex04-ooiwt$ ./search
haystack
1
needle
not found
ooiwt@pe101:~/ex04-ooiwt$ ./search
haystack 
2
hay
stack
0
3
ooiwt@pe101:~/ex04-ooiwt$ ./search
When life gives you lemons, make lemonade.
4
life
lemon
,
lime
5
20
26
not found

Question 9: Subtract

In this question, you are asked to write a program that subtracts two non-negative numbers which can be arbitrarily large.

The types provided by C can only represent a number up to a certain value. Thus, for this question, we will represent an integer using an arbitrarily long string consisting of characters (of type char) '0' to '9' (note: not integer 0 to 9).

C supports arithmetic operations on char values as well. To convert a digit character to the numerical value of a digit character, we can do the following:

To convert from a digit character to its numerical value, we subtract the char '0'. For instance, '6' - '0' will give us the value 6.
To convert from a numerical value of a digit to its character, we add the char '0'. For instance, 6 + '0' will give us the character '6'.

Write a program subtract that reads, from the standard input, two non-negative numbers \(a\) and \(b\), represented as strings consisting of digits '0' to '9' (\(a \ge b\)). The program then prints \(a - b\) to the standard output.

You will likely need to use the C standard library function strlen, which returns you the number of characters in a string (excluding the terminating '\0'). Look up how to use this function on your own. Alternatively, you can use length_of which you have written in Question 5 above.

Your program must not assume any limit to the length of the input, except that their length can fit into the type size_t.

In particular, you must not convert the input string to an integer before subtracting, since if the input string is long enough, it would not fit into a C integer type.

Sample Runs

ooiwt@pe119:~/ex04-ooiwt$ ./subtract
2
1
1
ooiwt@pe119:~/ex04-ooiwt$ ./subtract
10
9
1
ooiwt@pe119:~/ex04-ooiwt$ ./subtract
1000000
1
999999
ooiwt@pe119:~/ex04-ooiwt$ ./subtract
19330911933091193309119330911933091
1400060514000605140006051400605
19329511872577192703979324860532486