Multi-DNS Lookup

The first example is a multi-DNS lookup procedure. This example is shamelessly stolen from the paper Events Can Make Sense by Krohn, Kohler, and Kaashoek.

Sequential Version

1:
2:
3:
4:
5:
6:
7:
8:
9:
10:
  
  
  
  
  
  
  
  
  
  
void multi_dns_seq(
    
size_t N, char **names, struct addrinfo **infos )
{
    
size_t i;
    
for( i = 0; i < N; ++i )
    
{
        
assert( 0 == getaddrinfo(
            
names[i], NULL, NULL, &infos[i] ) );
    
}
}

In multi_dns_seq, we sequentially perform N DNS lookups and return the results in infos. Obviously the error handling is pretty bare-bones here; I encourage you try to look past that.

Concurrent Version

If we have a non-trivial number of names to look up, we might be able to get through them faster by putting out several requests simultaneously. Here's how we can do that in Charcoal.

1:
2:
3:
4:
5:
6:
7:
8:
9:
10:
11:
12:
13:
14:
15:
16:
17:
18:
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
void multi_dns_conc(
    
size_t N, char **names, struct addrinfo **infos )
{
    
size_t i, done = 0;
    
semaphore_t done_sem;
    
sem_init( &done_sem, 0 );
    
for( i = 0; i < N; ++i )
    
{
        
activate ( i )
        
{
            
assert( 0 == getaddrinfo(
                
names[i], NULL, NULL, &infos[i] ) );
            
if( ( ++done ) == N )
                
sem_inc( &done_sem );
        
}
    
}
    
sem_dec( &done_sem );
}

The most significant change here is the activate expression that starts on line 9. The expression or statement that follows the activate keyword (in this example, the block from line 10 to 15) is run concurrently with the activating code. The activated statement runs in a new activity. Activities are cooperative in the sense that at most one activity can be executing at a given time (in a given process/OS thread), and activities are never preempted to run peer activities.

Activities are a kind of cooperative thread.

Activated statements can read and write variables declared in their surrounding scope. There are several instances in this example code: N, i, done. Some care must be taken with these shared variables. First, the programmer has to decide whether the new activity should get a variable by-value or by-reference. This is very much like the identically named options for parameter passing in C++. The default is by-reference. For instance, all activities in this example share the same done variable, and see updates made by other activites. can use any variable that a "normal" statement could use, but the programmer has to make an important choice analogous to by-value/by-reference parameter passing in C++. By default variables are treated as references. In the example, the done variable is modified by multiple activities. This works fine. In contrast, each activity needs to know what the value of i was at activity creation time. That is why we put it in the by-value variable list. The newly created activity gets its own copy of each of the variables in this list.

What if we leave out the synchronization with done_sem?

What if we leave i off the by-value variables list?

What if we add something like done to the by-value variables list?

Concurrent Version with Simultaneous Lookup Limit

One weakness of our first concurrent multi-DNS lookup is that it tries to look up all the names simultaneously. If the list of names is long, this might overwhelm the operating system or network. We can limit the number of outstanding look ups with a few modest changes to the code:

1:
2:
3:
4:
5:
6:
7:
8:
9:
10:
11:
12:
13:
14:
15:
16:
17:
18:
19:
20:
21:
22:
23:
24:
25:
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
#define DEFAULT_MULTI_DNS_LIMIT 10

void multi_dns_conc(
    
size_t N, size_t lim,
    
char **names, struct addrinfo **infos )
{
    
size_t i, done = 0;
    
semaphore_t done_sem, lim_sem;
    
sem_init( &done_sem, 0 );
    
sem_init( &lim_sem,
        
lim < 1 ? DEFAULT_MULTI_DNS_LIMIT : lim );
    
for( i = 0; i < n; ++i )
    
{
        
sem_dec( &lim_sem );
        
activate ( i )
        
{
            
assert( 0 == getaddrinfo(
                
names[i], NULL, NULL, &infos[i] ) );
            
sem_inc( &lim_sem );
            
if( ( ++done ) == n )
                
sem_inc( &done_sem );
        
}
    
}
    
sem_dec( &done_sem );
}

Here we added a new semaphore (lim_sem) that the main activity decrements before activating each look up. The look up activities increment the limit semaphore after completing the lookup, allowing subsequent activations to occur.

An alternative pattern that would be more appropriate if we were using processes or threads is for the main code to create a pool of lim processes/threads/activities that would then carefully work their way through the list of names. This worker pool pattern is less clear for applications like this, and is only preferable if creating new activities is expensive. In Charcoal, activating a statement is very cheap — much closer to the cost of a procedure call than the cost of process creation.

1:
2:
3:
4:
5:
6:
7:
8:
9:
10:
11:
12:
13:
14:
15:
16:
17:
18:
19:
20:
21:
22:
23:
24:
25:
26:
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
  
#define DEFAULT_MULTI_DNS_LIMIT 10

void multi_dns_conc(
    
size_t N, size_t lim,
    
char **names, struct addrinfo **infos
    
semaphore_t *, struct addrinfo **infos )
{
    
size_t i, done = 0;
    
semaphore_t done_sem, lim_sem;
    
sem_init( &done_sem, 0 );
    
sem_init( &lim_sem,
        
lim < 1 ? DEFAULT_MULTI_DNS_LIMIT : lim );
    
for( i = 0; i < n; ++i )
    
{
        
sem_dec( &lim_sem );
        
activate ( i )
        
{
            
assert( 0 == getaddrinfo(
                
names[i], NULL, NULL, &infos[i] ) );
            
sem_inc( &lim_sem );
            
if( ( ++done ) == n )
                
sem_inc( &done_sem );
        
}
    
}
    
sem_dec( &done_sem );
}

Creative Commons License
Charcoal Programming Language by Benjamin Ylvisaker is licensed under a Creative Commons Attribution 4.0 International License