Read two files after skipping lines starting with ## from one file

I want to read two files concurrently after skipping lines starting with ## from one file.

file1.txt:
##
##
##
header1 header2

file2.txt:
header3 header4

Is there any way to skip ## and afterwards read lines in both files concurrently?

open IN1, "file1.txt";
open IN2, "file2.txt";

if <IN1> ^## skip
while(my $one = <IN1>, my $two = <IN2>){
    print "$one\t$two";

}

Outputs:
header1 header2    header1 header2

2 answers

  • answered 2018-01-11 21:01 Jeremy Jones

    In the loop, skip lines from file1 until a valid one is read, and exit with last as soon as either file runs out:

    open my $file1, "<", "file1.txt" or die $!;
    open my $file2, "<", "file2.txt" or die $!;
    
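    while (1) {
        my $file1_line = <$file1>;
        last if not defined $file1_line;  # file1 is exhausted
        next if $file1_line =~ /^##/;     # skip commented lines in file1
    
        my $file2_line = <$file2>;
        last if not defined $file2_line;  # file2 is exhausted
    
        chomp $file1_line;
        chomp $file2_line;
    
        # defined (rather than truth) checks keep a final "0" line from ending the loop early
        print "$file1_line\t$file2_line\n";
    }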
    
    close $file1;
    close $file2;
    

    Output:

    $ cat file1.txt
    ##
    ##
    ##
    header1 header2
    
    $ cat file2.txt 
    header3 header4
    
    $ perl mysolution.pl
    header1 header2 header3 header4
    
    $ 
    

  • answered 2018-01-11 22:01 zdim

    Skip through each file until you reach its marker, then continue reading both files in a new loop:

    use warnings;
    use strict;
    use feature 'say';
    
    my ($file1, $file2) = @ARGV;
    die "Usage: $0 file1 file2\n" if !$file1 or !$file2;
    
    open my $fh1, '<', $file1 or die "Can't open $file1: $!";
    open my $fh2, '<', $file2 or die "Can't open $file2: $!";
    
    # The second file's empty marker matches any line, so its skip loop stops after consuming the first line
    my ($re_marker1, $re_marker2) = (qr/^##/, qr//);
    
    while (<$fh1>) { last if /$re_marker1/ }; 
    while (<$fh2>) { last if /$re_marker2/ };
    
    while (1) { 
        my $l1 = <$fh1>; 
        my $l2 = <$fh2>; 
        chomp ($l1, $l2); 
    
        say "$l1  |  $l2"; 
    
        last if eof $fh1 or eof $fh2;
    }
    

    After the two while loops, the filehandles $fh1 and $fh2 are each positioned to read the line that follows the marker line just read from their file.
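
    The positioning can be checked with a small sketch. It assumes an in-memory string standing in for file1.txt from the question (the variable names are made up for illustration). It also shows that the skip loop stops after the first marker line, so a run of several ## lines needs one more step:

    ```perl
    use strict;
    use warnings;

    # Hypothetical in-memory copy of file1.txt from the question.
    my $data = "##\n##\n##\nheader1 header2\n";
    open my $fh, '<', \$data or die "Can't open in-memory file: $!";

    # Stop right after the first line that matches the marker.
    while (<$fh>) { last if /^##/ }

    # The handle now sits on the line following that first marker line.
    my $after_marker = <$fh>;
    chomp $after_marker;
    print "line after first marker: $after_marker\n";

    # To skip a whole run of marker lines, keep reading while they match.
    my $line = $after_marker;
    while (defined $line and $line =~ /^##/) {
        $line = <$fh>;
        chomp $line if defined $line;
    }
    print "first non-marker line: $line\n" if defined $line;
    ```

    With the question's data, the line after the first marker is still a ## line, and the extra loop is what reaches header1 header2.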

    You then continue reading from them in another loop. That loop exits once the last line of either file has been read, which is checked with eof (it returns 1 if the next read will hit end-of-file). Afterwards you can test the filehandles to see which one still has unread lines, if you need to process the files further.
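
    A minimal sketch of that follow-up test, assuming in-memory strings in place of the real files (the data and array names are hypothetical). Unlike the loop above, it checks eof before each read, so an empty file never yields an undefined line:

    ```perl
    use strict;
    use warnings;

    # Hypothetical in-memory data standing in for the two files.
    my $data1 = "a1\na2\na3\n";
    my $data2 = "b1\n";

    open my $fh1, '<', \$data1 or die "Can't open in-memory file 1: $!";
    open my $fh2, '<', \$data2 or die "Can't open in-memory file 2: $!";

    # Read pairs of lines until either handle is at end-of-file.
    my @pairs;
    until (eof($fh1) or eof($fh2)) {
        my $l1 = <$fh1>;
        my $l2 = <$fh2>;
        chomp($l1, $l2);
        push @pairs, "$l1  |  $l2";
    }

    # Whichever handle is not at end-of-file still has unread lines.
    my @leftover1 = eof($fh1) ? () : <$fh1>;
    my @leftover2 = eof($fh2) ? () : <$fh2>;
    chomp(@leftover1, @leftover2);

    print "$_\n" for @pairs;
    print "file1 leftover: $_\n" for @leftover1;
    print "file2 leftover: $_\n" for @leftover2;
    ```

    Here file2 runs out first, so the two remaining lines of file1 end up in @leftover1.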

    Note that eof is seldom needed; this is one of the rare situations where it is just the right tool.

    The markers could also be made into command-line options. This code makes a few simple assumptions about problem details that the question leaves unspecified.
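
    One way the command-line options might look, sketched with the core Getopt::Long module. The option names --marker1 and --marker2 and the helper parse_markers are hypothetical, and the defaults mirror the patterns used in the script above:

    ```perl
    use strict;
    use warnings;
    use Getopt::Long qw(GetOptionsFromArray);

    # Hypothetical helper: pulls --marker1/--marker2 out of an argument list
    # and returns the compiled patterns followed by the remaining arguments.
    sub parse_markers {
        my @args = @_;
        my ($marker1, $marker2) = ('^##', '');   # defaults taken from the script above

        GetOptionsFromArray(
            \@args,
            'marker1=s' => \$marker1,
            'marker2=s' => \$marker2,
        ) or die "Usage: $0 [--marker1 REGEX] [--marker2 REGEX] file1 file2\n";

        # Whatever is left in @args after option parsing are the file names.
        return (qr/$marker1/, qr/$marker2/, @args);
    }

    my ($re_marker1, $re_marker2, $file1, $file2) = parse_markers(@ARGV);
    ```

    It could then be run as, for example, perl script.pl --marker1 '^%%' file1.txt file2.txt, with the rest of the script unchanged.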