uncompress           package:Rcompression           R Documentation

_U_n_c_o_m_p_r_e_s_s _i_n-_m_e_m_o_r_y _d_a_t_a _i_n _e_i_t_h_e_r _z_l_i_b _o_r _G_N_U _z_i_p _f_o_r_m_a_t

_D_e_s_c_r_i_p_t_i_o_n:

     These functions take a raw vector in R that contains data
     compressed with some algorithm and attempt to inflate
     (uncompress) the data so that it can be processed directly with
     other tools.  They support the basic zlib algorithm, the GNU zip
     format and bzip2-style compression.  The difference between
     these functions and ones such as 'gzcon' is that these work with
     data already in memory, not in files.  They are also not
     currently implemented as connections.

_U_s_a_g_e:

     uncompress(content, size = length(content) * 10, asText = TRUE,
                 resizeBy = 2)
     gunzip(content, size = length(content) * 5, asText = TRUE)
     bunzip2(content, size = length(content) * 5, verbose = 0, asText = TRUE)

_A_r_g_u_m_e_n_t_s:

 content: the raw vector in R that contains the entire stream of
          compressed data.

    size: an estimate of how big the resulting inflated contents will
          be. This is used to create the buffer that will hold the
          results.  If it is not sufficiently large, the buffer will
          be expanded, so it is not vital that this be correct.
          However, a value that is about right, and slightly larger
          than the actual size of the inflated content, makes the
          processing faster and avoids wasting memory.

 verbose: an integer between 0 and 4 inclusive. Larger values display
          more information on the console about the inflation process
          as it is being performed. 0 means no information.

  asText: a logical value indicating whether to return the result as a
          simple R string ('TRUE') or leave it uninterpreted as a 'raw'
          vector ('FALSE').  'FALSE' is appropriate if the uncompressed
          contents are binary data.

resizeBy: the factor by which the internal memory buffer used to store
          the uncompressed data is grown when more memory is needed.
          The default is to double the size. In the worst case, this
          might mean needing only one more byte to complete the
          decompression, but asking for '2 * (total_length - 1)'. This
          parameter allows the caller to control the expansion using
          contextual knowledge.
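
     The effect of 'resizeBy' can be worked through with a little
     arithmetic.  The sketch below is only an illustration of the
     growth policy described above, written in plain R; the helper
     'simulateGrowth' is hypothetical, is not part of Rcompression, and
     is not claimed to match the package's internal code.

```r
# simulateGrowth() is a hypothetical helper written for this illustration.
# It mimics "grow the buffer by a constant factor until the data fits".
simulateGrowth = function(initial, needed, resizeBy = 2) {
    stopifnot(initial >= 1, resizeBy > 1)
    size = initial
    resizes = 0
    while (size < needed) {
        size = ceiling(size * resizeBy)   # grow by the requested factor
        resizes = resizes + 1
    }
    c(finalSize = size, resizes = resizes, unused = size - needed)
}

# Worst case from the text: the buffer is one byte short.
simulateGrowth(1000, 1001)                  # finalSize 2000, 1 resize, 999 unused
simulateGrowth(1000, 1001, resizeBy = 1.5)  # finalSize 1500, 1 resize, 499 unused

# A poor initial guess needs several rounds of growth.
simulateGrowth(100, 1000)                   # finalSize 1600, 4 resizes, 600 unused
```

     A smaller factor wastes less memory when the initial guess is
     nearly right, while a larger factor needs fewer rounds of growth
     when the guess is far too small.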

_V_a_l_u_e:

     The inflated contents, either as a string (when 'asText' is
     'TRUE') or as a raw vector.
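
     The two return types can be compared with a short sketch.  It
     assumes that the Rcompression package is installed and uses only
     the calls shown in this page; the whole block is skipped if the
     package is not available.

```r
# Sketch only: runs when the Rcompression package is installed.
if (requireNamespace("Rcompression", quietly = TRUE)) {
    library(Rcompression)
    x = "A string to compress"
    g = compress(x)

    txt = uncompress(g)                  # asText = TRUE (default): a string
    bin = uncompress(g, asText = FALSE)  # left uninterpreted as a raw vector
    print(class(txt))
    print(class(bin))

    # For text payloads the raw result can be converted by hand.
    print(rawToChar(bin))
}
```

     Keeping the result as a raw vector is the safer choice when the
     payload may contain bytes that are not valid in an R string.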

_A_u_t_h_o_r(_s):

     Duncan Temple Lang

_R_e_f_e_r_e_n_c_e_s:

     zlib and bzip2

_S_e_e _A_l_s_o:

     'compress'

_E_x_a_m_p_l_e_s:

      x = "A string to compress"
      g = compress(x)
      uncompress(g) == x

      # Read the contents of a GNU zipped file and uncompress them
      # directly from memory.  We could do this with gzfile() directly
      # from R, but this is just arranging to get the raw data locally
      # rather than, e.g., from an HTTP request.

      f = system.file("sampleData", "NAMESPACE.gz", package = "Rcompression")
      con = file(f, "rb")
      fs = file.info(f)$size
      content = readBin(con, "raw", fs)
      close(con)
      gunzip(content, 10000)
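
     Rather than guessing a value such as 10000 for 'size', a hint can
     often be read from the data itself: a gzip stream ends with a
     4-byte little-endian field holding the uncompressed length modulo
     2^32.  The sketch below uses base R only; 'gzISize' is a
     hypothetical helper, not part of Rcompression, and the hint is
     only reliable for a single-member gzip stream smaller than 4 GiB.

```r
# Build a small gzip stream in memory so that the example is self-contained.
txt = paste(rep("A line of sample text", 50), collapse = "\n")
tmp = tempfile(fileext = ".gz")
con = gzfile(tmp, "wb")
writeBin(charToRaw(txt), con)
close(con)
content = readBin(tmp, "raw", file.info(tmp)$size)
unlink(tmp)

# gzISize() is a hypothetical helper: decode the last four bytes of the
# stream (least significant byte first) as the uncompressed length.
gzISize = function(content) {
    n = length(content)
    stopifnot(n >= 4)
    b = as.numeric(content[(n - 3):n])
    sum(b * 256^(0:3))
}

hint = gzISize(content) + 1    # slightly larger than the expected result
print(hint)

# With Rcompression loaded, the hint could then be passed on:
# gunzip(content, size = hint)
```

     If the hint turns out to be wrong, the buffer is still expanded as
     described for 'size', so the only cost is some extra resizing.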

