C / C++
C was originally developed at Bell Labs by Dennis Ritchie between 1972 and 1973
to make utilities running on Unix. Later, it was applied to re-implementing the
kernel of the Unix operating system. During the 1980s, C gradually gained
popularity. It has become one of the most widely used programming languages,
with C compilers from various vendors available for the majority of existing
computer architectures and operating systems. C has been standardized by the
International Organization for Standardization (ISO).
Installation
Installing A C compiler on Windows is a bit opaque. Here are notes from what I remember
the last time I embarked on this journey.
Clang works on Windows, the path was located here:
C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\Llvm\bin
Once you have added this directory to your path, you can use Clang as you would normally.
<stdint.h>
Specifying Integer Size
Sometimes you know that you're only going to need a number that goes up to 255.
It doesn't make sense to create a full int since that is typically encoded as
4-bytes large. A single byte is capable of coding numbers up to 255.
Luckily, C allows you to specifically declare the size of your integer, with the following syntax.
| Bytes | Bits | Signed Type | Unsigned Type |
|---|
| 1 | 8 | int8_t | uint8_t |
| 2 | 16 | int16_t | uint16_t |
| 4 | 32 | int32_t | uint32_t |
| 8 | 64 | int64_t | uint64_t |
You're technically using some of these without even knowing it. Assuming you're
running C on a 64-bit system, these following types are equivalent:
char is a int8_t, an unsigned 1-byte integershort is a int16_t, a signed 2-byte integerint is a int32_t, a signed 4-byte integerlong is a int64_t, a signed 8-byte integersize_t is a uint64_t, an unsigned 8-byte integer
GNU C Compilers
gcc is the Gnu C compilerg++ is the Gnu C++ compiler
However, with this being said, gcc is a fully functional C++ compiler, and
g++ is effectively just a mapping to gcc -xc++ -lstdc++ -shared-libgcc
Setting the Version
In C
In C++
Using clang++:
Using g++:
g++ -std=c++1z # (ISO c++ standard)
g++ -std=gnu++1z # (GNU c++ standard)
Checking Processor
```shell
gcc -march=native -Q --help=target | grep -- '-march=' | cut -f3 | head -n 1
# on macOS: 'haswell'
# on AWS EC2: 'skylake-avx512'
# on Raspberry Pi: 'cortex-a72'
```
Import the <stdio.h> library to get started
The snprintf() function can be used to write a formatted string to a buffer. The arguments supplied are as follows:
char* bufferint sizeconst char* format<arg1>, <arg2>, ...
The puts() function writes a C-string to stdout, stopping at the first null terminator \0 found. It additionall appends a newline \n to the output. To print to a particular stream instead of stdout, and/or to avoid the insertion of a trailing newline, \n instead of stdout, use the fputs() function instead.
Printing a formatted string to stdout:
#include <stdio.h>
#include <stdint.h>
int main() {
uint8_t size = 100;
char buffer[size];
const char* format = "It takes %hhu to tango.";
uint8_t value = 2;
snprintf(buffer, size, format, value);
puts(buffer);
}
Parsing Strings
I've found a very smooth way to split strings is by iterating through tokens
generated to match a regular expression. See below:
#include <regex>
#include <string>
#include <iostream>
int main() {
std::string content("1 - line #1\r\n2 - line #2\r\n3 - line #3");
// generate an iterator through each line token, excluding the <CR><LF>
// note: exclusion of the token itself is denoted by the `-1` provided as the 4th
// argument to the constructor of the std::regex_token_iterator
std::regex crlf("\\r\\n");
auto line{std::sregex_token_iterator{content.begin(),
content.end(),
crlf,
-1}};
while (line != std::sregex_token_iterator()) {
std::cout << "------" << std::endl;
std::cout << "[" << *line << "]" << std::endl;
std::string buffer(line->str());
// for the number, omit the (-1) previously provided
// which will cause this to tokenize the matches to the
// regular expression itself
std::regex blackspace("\\S+");
auto text{std::sregex_token_iterator{buffer.begin(),
buffer.end(),
blackspace,
}};
while (text != std::sregex_token_iterator()) {
std::cout << "\t" << "(" << *text << ")" << std::endl;
++text;
}
++line;
}
std::cout << "------" << std::endl;
}
------
[1 - line #1]
(1)
(-)
(line)
(#1)
[2 - line #2]
(2)
(-)
(line)
(#2)
[3 - line #3]
(3)
(-)
(line)
(#3)
Splitting Strings
There's no split() function for strings... until now!
Split a string on each occurance of a regular expression pattern:
std::vector<std::string> split(const std::string &text, const std::regex ®ex) {
return std::vector<std::string>{
std::sregex_token_iterator{
text.begin(),
text.end(),
regex,
-1
},
std::sregex_token_iterator()
};
}
Split text on every occurance of a carriage return, followed by a newline
std::regex crlf{"\\r\\n"};
for (auto& i : split(text, crlf)) {
std::cout << "[" << i << "]" << std::endl;
}
Slurping Files
Initialization
Initializer List
You can use Initializer Lists to zero out elements.
Initializing a 2D array with initializer lists even supports implied zero values, see below.
Useful <stdio.h> Commands
gets: Reads a c-string from stdin
fputc: Writes a character to a stream
fgetc: Reads a character from a stream
remove: Removes a file
rename: Renames a file
fopen: Opens a file
fwrite: Writes a file
fclose: Closes a file
setbuf: Sets the stream buffer
C++ Style Guide
If you develop software on a MacOS, you might want to use the style guide, which is compatible with clang's auto-formatting capabilities.
The style guide notes outlined below are taken directly from Apple's llvm and outline some of the details I found to be the most noteworthy.
Avoid writing #include <iostream> unless you really need more functionality than <stdio.h> is capable of providing with its trusty printf() function
Avoid writing using namespace std. It pollutes the global namespace and can cause collisions which are frustrating to debug
Avoid using std::endl as a generic substitute for \n. The command std::endl calls os.put('\n') and then calls os.flush() which is likely more computationally intensive than you desire.
Avoid using post-increment operators. Favor using pre-increment operators. They are computationally faster and save you from painful debugging later on.
Avoid throwing exceptions. Most exceptions can be avoided with good programming practices, and they are computationally expensive to handle.
When coding in C++, favor using std::static_cast<T>(var) over C-style casts, (e.g. (T) var).
Formatting time and date
A peer in my operating systems class asked:
Since there is no specific format required, I just wanted to check if this
format is acceptable for date:
1:35, 1-24-2021 // represents 1:35 AM UTC, January 24 2021
13:35, 1-24-2021 // represents 1:35 PM UTC, January 24 2021
I'm not a CP/TA for CSCI 350, but I know of multiple classes where points are
taken off for students who encode time this way.
For example, the string representation of 1:02 AM UTC, March 4, 0005, using
this format, would be:
Which would be a bit confusing. To solve this, ISO came together in 1988 and
created a timestamp format, and published it under ISO 8601, with the most
common implementation being that documented in RFC 3339. Typically for
computers, time as a string will be compliant with these formats.
The main reason: A list of ISO 8601 compliant timestamps, when sorted
lexicographically, are also in chronological order
That example written above in an ISO 8601 compliant format would be
0005-04-03T01:02:00+00:00
In C/C++ You can use strftime() to create ISO 8601 / RFC 3339
compliant timestamp strings.
Sources:
- my burning hatred for dealing with date/time
- data engineering internship
Inheritance
class example {
demo();
};
class::demo(){}
In the example above, the :: is called the scope resolution operator, which
points out to the compiler that the function demo() is a member to the class
example.
CMake
Adding GoogleTest to a C++ project
It's late, so I'm putting this here so I don't forget it.
Project file CMakeList.txt, at the root of the project
# ==============================================================================
# Add test cases using Google Test
# ==============================================================================
# Google Test
set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED True)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -g -Wall --std=c++17")
# all the remaining CMake code will be placed here
include(FetchContent)
fetchcontent_declare(
googletest
GIT_REPOSITORY https://github.com/google/googletest.git
GIT_TAG release-1.11.0
)
fetchcontent_makeavailable(googletest)
# Suppress warning message displayed when including Google Test
if (NOT DEFINED CMAKE_SUPPRESS_DEVELOPER_WARNINGS)
set(CMAKE_SUPPRESS_DEVELOPER_WARNINGS 1 CACHE INTERNAL "No dev warnings")
endif ()
# From here, importing a file that uses <gtest/gest.h> should be about
# as simple as this, though # I don't fully understand the implications of the
# PUBLIC/PRIVATE/INTERFACE parameter as of yet.
add_executable(t_node t_node.cpp)
target_link_libraries(t_node PUBLIC gtest)
# ==============================================================================
#### Pre-compiled headers
Leaving this here as well, for my own memory
```cmake
target_precompile_headers(ajt INTERFACE ajt.hh
<string>
<vector>
<fstream>
<sstream>
<iostream>
<map>
<set>
<unordered_map>
<unordered_set>
<regex>
<exception>)
add_library(http INTERFACE http.hh)
target_precompile_headers(http)
add_library(request INTERFACE request.hh)
target_link_libraries(request INTERFACE http)
target_precompile_headers(request REUSE_FROM http)
add_library(response INTERFACE response.hh)
target_link_libraries(response INTERFACE http)
target_precompile_headers(response REUSE_FROM http)
add_executable(network remotehosting.cpp)
target_link_libraries(remotehosting INTERFACE request response)
Namespace Aliasing
Leaving this here because I want to use it at some point, but when
I'm not doing a graded assignment
#include <string>
#include <vector>
#include <list>
#include <regex>
#include <set>
#include <map>
#include <unordered_set>
#include <unordered_map>
#include <queue>
#include <sstream>
#include <fstream>
#include <iostream>
namespace etc {
template <typename K, typename V>
using map = std::unordered_map<K, V>;
template<typename K, typename V>
using multimap = std::unordered_multimap<K,V>;
template<typename K, typename V>
using treemap = std::map<K, V>;
template<typename K, typename V>
using multitreemap = std::multimap<K, V>;
template <typename T>
using set = std::unordered_set<T>;
template<typename T>
using multiset = std::unordered_multiset<T>;
template<typename T>
using treeset = std::set<T>;
template<typename T>
using multitreeset = std::multiset<T>;
template<typename T>
using list = std::vector<T>;
template<typename T>
using linkedlist = std::list<T>;
template<typename T>
using heap = std::priority_queue<T>;
}
Utility Functions
#include <fstream>
#include <sstream>
#include <iotream>
int main() {
// read from standard input
std::ifstream ifile("/dev/fd/0");
// write to standard output
std::ofstream ofile("/dev/fd/1");
std::stringstream sfile;
// slurp the input into a string stream
sfile << ifile.rdbuf();
// output the string contained by the stream
ofile << sfile.str();
}
- Checking if a file is empty
std::ifstream ifile;
ifile.open(FILEPATH);
// if the file is empty, return an empty string
if (ifile.peek() == std::ifstream::traits_type::eof()) {
return std::string{};
}
Operator Overloading
In the example below, I create a class number, and overload the addition
operator, such that instead of adding 1 and 2 together, the numbers
are instead concatenated, causing the value of number charlie to equal 3.
#include <iostream>
#include <string>
struct number {
number(int val): val(val) {}
number operator+(number rhs) {
return number(std::stoi(std::string(std::to_string(this->val) + std::to_string(rhs.val))));
}
friend std::ostream& operator<<(std::ostream& os, const number& num) {
os << num.val;
return os;
}
int val;
};
int main(int argc, char** argv) {
number alpha(1);
number bravo(2);
number charlie = alpha + bravo;
std::cout << charlie << std::endl;
}
Output
GCC
Flags
Common Flags
-E- Have GCC run the pre-processer and output the resulting source code.
-S- Have GCC run the assembler and output the intermediate assembly code.
-C- Have GCC produce only the final result after compiled code
-I DIR- When the codebase refers to files in a
#include directive, check the contents of DIR for a matching filename. -DMACRO=DEFINITION- Define a macro
MACRO with definition DEFINITION, same as saying #define MACRO = DEFINITION in your source files. -BPREFIX- This option specifies where to find the executables, libraries, include files, and data files of the compiler itself.
-l LIBRARY- Search the library named
LIBRARY when linking. (The second alternative with the library as a separate argument is only for POSIX compliance and is not recommended.) -LLIBRARY- When we link a library with the
-l flag, check the contents of the directory LIBRARY for a match. (e.g.: a valid match to -Lfolder-lblawould befolder/libbla.so`)
Useful Flags
-s- Remove all symbol table and relocation information from the executable.
-nostdinc++- Do not search for header files in the standard directories specific to C++ .
-nostdinc- Do not search for header files in the standard directories specific to C.
-march=native + -mtune=native- Enables optimizations which work specifically on your cpu and might not on others.
-pthread- Define additional macros required for using the POSIX threads library. You should use
this option consistently for both compilation and linking. This option is supported on
GNU/Linux targets, most other Unix derivatives, and also on x86 Cygwin and MinGW
targets.
-qopenmp-simd: Enable vector optimization to improve performance of the parallel STL
-fPIC- Compile the codebase into position independent code, useful for making shared libraries.
-fPIE- Compile the codebase into a position independent executable useful for securing code.
-fsanitize=pointer-compare- Ensure that their's no comparasson between two pointers from different objects using the relational operators. The option must be combined with -fsanitize=address
-fsanitize=address:- Enable the address sanitizer to check for possible memory errors.
-fdiagnostics-color=aut- Use color in diagnostic messages only when the standard error is a terminal
-fdiagnostics-show-template-tree- Have the compiler errors more specifically identify mismatches between two templates, including a colorized tree as a visual aid.
-fsanitize=thread- Enable ThreadSanitizer, a fast data race detector. Memory access instructions are instrumented to detect data race bugs.
-fsanitize=leak- Enable LeakSanitizer, a memory leak detector. This option only matters for linking of executables and the executable is linked against a library that overrides "malloc" and other allocator functions.
-fdiagnostics-show-template-tree- Have the compiler errors more specifically identify mismatches between two templates, including a colorized tree as a visual aid.
-fstack-check- Generate code to verify that you do not go beyond the boundary of the stack. You
should specify this flag if you are running in an environment with multiple threads,
but you only rarely need to specify it in a single-threaded environment since stack
overflow is automatically detected on nearly all systems if there is only one stack.
-c or -S or -E- prevent the linker from running at the end.
Later on, I hope to add information on the following: -ftabstop -fdirectives-only -M -MD -MMD -ftrack-macro-expansion -fpch-deps -fpch-preprocess -fworking-directory -C -CC -H -fdebug-cpp -nostdlib -nolibc -nodefaultlibs -nostartfiles -r -s -static -shared
Warning Flags
-Wall- Enable a sensible list of default warnings.
-Werror- Treat the detection of warnings as a failure of the build process.
-Wmissing-include-dirs- Warn if a user-supplied include directory does not exist.
-Wundef- Warn if an undefined identifier is evaluated in an "#if" directive. Such identifiers are replaced with zero.
-Wredundant-decls- Warn if anything is declared more than once in the same scope,
even in cases where multiple declaration is valid and changes nothing.
-Winvalid-pch- Warn if a precompiled header is found in the search path but cannot be used.
- `-Wlarger-than=BYTE_SIZE
- Warn whenever an object is defined whose size exceeds BYTE_SIZE.
-Weffc++- Warn about violations of the following style guidelines from
Scott Meyers' Effective C++ series of books:
- Define a copy constructor and an assignment operator for classes with
dynamically-allocated memory.
- Prefer initialization to assignment in constructors.
- Have
operator= return a reference to *this. - Don't try to return a reference when you must return an object.
- Distinguish between prefix
++i and postfix i++ forms of increment and decrement operators. - Caution against overload the
&&, ||, or , operators.
Debug Flags
-ggdb- Produce debugging information for use by GDB. This means to use the most expressive
format available (DWARF, stabs, or the native format if neither of those are
supported), including GDB extensions if at all possible.
-gdwarf- Produce debugging information in DWARF format (if that is supported).